Mysql master-slave Data Consistency Verification and error correction tools

Source: Internet
Author: User
Tags install perl
Directory 1, Overview 2, percona-tooldit tool installation 3, new user 4, pt-table-checksum use 5, pt-table-sync use 6, personal summary 1. Overview if you are an O & M engineer, if you deploy the mysql system in the production environment, and if your online mysql is based on the master-slave replication architecture, congratulations, it may bring

Directory 1, Overview 2, percona-tooldit tool installation 3, new user 4, pt-table-checksum use 5, pt-table-sync use 6, personal summary 1. Overview if you are an O & M engineer, if you deploy the mysql system in the production environment, and if your online mysql is based on the master-slave replication architecture, congratulations, it may bring

Directory

1. Overview

2. Install percona-tooldit

3. Create a user

4. Use pt-table-checksum

5. Use pt-table-sync

6. Personal Summary

1. Overview

If you are an O & M engineer, if you have deployed the mysql system in the production environment, and if your online mysql is based on the master-slave replication architecture, congratulations, it may bring you "Bad Luck" with inconsistent Master/Slave Data ".

Because of the native features of the mysql replication architecture, data on the master and slave servers cannot be "synchronized", so latency is inevitable, even on less busy servers, the slave database can catch up with the progress of the master database during busy hours, the data written to the master database may also be inconsistent with the data written to the slave database due to factors such as server crash, illegal shutdown, and program bug. In this case, mysql does not have a mechanism to check master-slave data consistency. for users, you do not know that the master-slave data is inconsistent.

Therefore, a tool is needed to solve this problem. The pt-table-checksum tool in the percona-toolkit tool is efficient without affecting mysql performance, A tool that can detect master-slave data inconsistency. When data inconsistency is generated, the percona-tools tool also provides the pt-table-sync tool to fix inconsistent data, which saves the trouble of re-deploying the slave server.

However, in a real production environment, these two tools still have some limitations. To be precise, the asynchronous replication architecture of mysql leads to the limitations of the tools, because the slave database is slower than the master database, the data in the table on the master database is often inconsistent with that in the slave database, this is not due to slave database latency, so these two tools are best used in the following scenarios:

A) when the slave server is upgraded to the master server, data consistency check must be performed with the old master server when the new master server is launched.

B) Check Data Consistency after data migration

C) Check the consistency after the slave database is updated due to misoperation.

D) Check the data consistency in the plan

The following example shows how to build a replica Environment Based on http://zhaochj.blog.51cto.com/410705/1635982.

2. Install percona-toolkit

First install the dependent package and percona-toolkit:

[root@master ~] yum -y install perl perl-devel libaio libaio-devel perl-Time-HiRes perl-DBD-MySQL perl-IO-Socket-SSL[root@master ~] rpm -ivh percona-toolkit-2.2.13-1.noarch.rpm

3. Create a user

Create a non-root user to perform data consistency detection, pt-table-checksum and pt-table-sync must be connected to the slave database for data viewing, data modification, and other operations. Therefore, create a user in the master database:

mysql> GRANT select,insert,update,delete,create,process,super,replication slave ON *.* TO monitor@'192.168.0.%' IDENTIFIED BY '111111';Query OK, 0 rows affected (0.01 sec)mysql> GRANT select,insert,update,delete,create,process,super,replication slave ON *.* TO monitor@'127.0.0.1' IDENTIFIED BY '111111';Query OK, 0 rows affected (0.00 sec)mysql> FLUSH PRIVILEGES;Query OK, 0 rows affected (0.00 sec)

After the user is created, test whether the master database and slave database can be normally connected (because this user also needs to connect to the master database, the above two users are created ). This user really requires a lot of permissions, but there is no way. This is because the permissions required for the working principles of the two tools in percona-toolkit can be directly granted to ALL for convenience.

4. Use pt-table-checksum

Use the tb1 table in the mydb1 database as a test. view the tb1 content in the master database:

mysql> SELECT * FROM mydb1.tb1;+----+-------+------+| id | name  | age  |+----+-------+------+|  1 | tom   |   12 ||  2 | jem   |   23 ||  3 | jason |   29 ||  4 | aaa   |   30 ||  5 | b     |   69 |+----+-------+------+5 rows in set (0.01 sec)

View the content of mydb1.tb1 on the slave database:

mysql> select * from mydb1.tb1;+----+-------+------+| id | name  | age  |+----+-------+------+|  1 | tom   |   12 ||  2 | jem   |   23 ||  3 | jason |   29 ||  4 | aaa   |   30 ||  5 | b     |   69 |+----+-------+------+5 rows in set (0.00 sec)

In this case, the master and slave data are consistent. Use the pt-table-checksum tool to test the output result:

[Root @ master ~] # Pt-table-checksum -- nocheck-replication-filters -- replicate = mydb1.checksums -- databases = mydb1 h = 127.0.0.1, u = monitor, p = 111111 Replica slave has binlog_format ROW which cocould cause pt-table-checksum to break replication. please read "Replicas using row-based replication" in the LIMITATIONS section of the tool's documentation. if you understand the risks, specify -- no-check-binlog-format to disable t His check. # An error is reported because the binary log in my mysql environment is ROW-based, that is, 'binlog _ format = row'. If it is a ROW-based replication environment, percona does not officially recommend using the pt-table-checksum tool for data consistency check, but it also provides an option to skip this check.

Meanings of common options:

-- Nocheck-replication-filters: we use -- databases to specify the database to be checked.

-- Replicate: Write the verification information to the specified table.

-- No-check-binlog-format: do not check the binary log file format

-- Replicate-check-only: only information with inconsistent data is displayed.

-- Databases: Specifies the database for verification. Multiple databases are separated by commas (,).

-- Tables: Specifies the table to be verified. Multiple tables are separated by commas (,).

H: host, the IP address of the master server

U: Account

P: Password


Add the "-- no-check-binlog-format" option and test it again:

[Root @ master ~] # Pt-table-checksum -- nocheck-replication-filters -- replicate = mydb1.checksums -- no-check-binlog-format -- databases = mydb1 -- h = 127.0.0.1, u = monitor, p = 111111 ts errors diffs rows chunks skipped time TABLE04-21T18: 00: 59 0 0 5 1 0 0.280 mydb1.tb104-21T18: 00: 59 0 0 2 1 0 0.331 mydb1.tb2 # observe the column "DIFFS". If the data is inconsistent, the value is not "0.

After running the preceding command, an error similar to "no slaves were found" may be reported, because the slave server cannot be connected. After running the preceding command, pt-table-checksum will automatically search for the Master's slave server using a recursive method after connecting to the master mysql. First, Run "show processlist ;", then run "show slave hosts" to find out. If you encounter an error that cannot connect to the slave server. cnf adds "report_host = slave server IP" to actively tell the master server that it is the master slave, add the "-- recursion-method = hosts" option to the pt-table-checksum command, in this way, the master server can use the "show slave hosts" command to view the slave server IP address.


We have made the Master/Slave Data inconsistent manually. On the slave server, we changed the age column with the ID number 5 in the tb1 table to 20:

mysql> update mydb1.tb1 set age=20 where id=5;Query OK, 1 row affected (0.01 sec)Rows matched: 1  Changed: 1  Warnings: 0mysql> select * from mydb1.tb1;+----+-------+------+| id | name  | age  |+----+-------+------+|  1 | tom   |   12 ||  2 | jem   |   23 ||  3 | jason |   29 ||  4 | aaa   |   30 ||  5 | b     |   20 |+----+-------+------+5 rows in set (0.00 sec)

In this way, the master and slave data are inconsistent. We run the pt-table-checksum tool on the master server to test it:

[Root @ master ~] # Pt-table-checksum -- nocheck-replication-filters -- replicate = mydb1.checksums -- no-check-binlog-format -- recursion-method = hosts -- databases = mydb1 h = 127.0.0.1, u = monitor, p = 111111 ts errors diffs rows chunks skipped time TABLE04-21T18: 27: 00 0 1 5 1 0 0.307 mydb1.tb104-21T18: 27: 01 0 0 2 1 0 0.048 mydb1.tb2 # the value of "DIFFS" in the table tb1 is changed to "1 ".

The pt-table-sync tool is now available.

5. Use pt-table-sync

You can use the "-- print" option to check the inconsistency between the master and slave nodes:

[Root @ master ~] # Pt-table-sync -- replicate = mydb1.checksums h = 127.0.0.1, u = monitor, p = 111111 h = 192.168.0.202, u = monitor, p = 111111 -- charset = utf8 -- printREPLACE INTO 'mydb1 '. 'tab1' ('id', 'name', 'age') VALUES ('5', 'B', '69')/* percona-toolkit src_db: mydb1 src_tbl: tb1 src_dsn: A = utf8, h = 127.0.0.1, p = ..., u = monitor dst_db: mydb1 dst_tbl: tb1 dst_dsn: A = utf8, h = 192.168.0.202, p = ..., u = monitor lock: 1 transaction: 1 changing_src: mydb1 . Checksums replicate: mydb1.checksums bidirectional: 0 pid: 3205 user: root host: master */; # The output information above indicates that the age value of row id = 5 on the slave database should be 69. # There are two groups of commands: "h =, u =, p =". The first group specifies the master server, and the second group points to the slave server.

Meanings of common options:

-- Replicate =: indicates the checksums table generated based on the pt-table-checksum tool to fix problematic data.

-- Databases =: indicates the database to be synchronized. Multiple databases are separated by commas (,).

-- Tables =: indicates the data table for synchronization. Multiple Data tables are separated by commas (,).

H =: Server Host Name

U =: Account

P =: Password

-- Print: print only, but do not execute the command

-- Execute: execute the command


After confirming that the data is inconsistent, replace the "-- print" option with "-- execute" to execute the replacement statement:

[root@master ~]# pt-table-sync --replicate=mydb1.checksums h=127.0.0.1,u=monitor,p=111111 h=192.168.0.202,u=monitor,p=111111 --charset=utf8 --execute

Run the pt-table-checksum tool to check whether the data of the two tables is consistent:

[root@master ~]# pt-table-checksum --replicate=mydb1.checksums --nocheck-replication-filters --no-check-binlog-format --databases=mydb1 h=127.0.0.1,u=monitor,p=111111            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE04-21T21:42:31      0      0        5       1       0   0.309 mydb1.tb104-21T21:42:32      0      0        2       1       0   0.304 mydb1.tb2

In this way, the data is successfully corrected.

6. Personal Summary

These two tools are generally used in combination to make up that mysql does not have a Data Consistency Verification Mechanism, so that O & M personnel can better maintain them in the master-slave replication architecture. In the pt-table-checksum tool, we recommend that you use statement-based replication. Statement-based and row-based replication have their own advantages and disadvantages, if the pt-table-checksum tool is often used in later maintenance, I think we should set binlog_format to statement or mixed.

In the end, if the data inconsistency between the master and slave is generated in the production environment, rather than the delay, when using these tools to operate data, remember to back up the source data, whether the source data is intact or some data has been damaged, you must back up the source data before performing data restoration. in a high-pressure environment such as data recovery, who can ensure that all your operations are standard and correct? If the operation fails, you have a chance to roll back.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.