In the previous article, pt-table-checksum was used to verify data consistency. This section describes how to verify master-slave data consistency. when determining whether master-slave data is inconsistent, you must consider how to fix it. Push here
In the previous article, pt-table-checksum was used to verify data consistency. This section describes how to verify master-slave data consistency. when determining whether master-slave data is inconsistent, you must consider how to fix it. Push here
In the previous article, pt-table-checksum was used to verify data consistency. This section describes how to verify master-slave data consistency. when determining whether master-slave data is inconsistent, you must consider how to fix it. We recommend another important tool of percona-toolkit: pt-table-sync.
Percona-toolkit installation and introduction
Percona-toolkit: pt-summary, pt-mysql-summary, pt-config-diff, and pt-variable-advisor
Introduction to pt-table-sync
As the name suggests, it is used to repair data inconsistency between multiple instances. It can restore the master-slave data to the final consistency, or repair multiple unrelated database instances through the Application dual-write or multi-write. At the same time, it also integrates the pt-table-checksum verification function, which can be repaired while being verified, or repaired based on the calculation results of pt-table-checksum.
Working Principle 1. Calculation of the checksum value of a single row
Like pt-table-checksum, the calculation Logic first checks the structure, obtains the Data Types of each column, converts all data types to strings, and then uses concat_ws () function to calculate the checksum value of the row. Checksum is calculated using crc32 by default.
2. Calculation of the checksum value of the data block
Like the pt-table-checksum tool, pt-table-sync intelligently analyzes table indexes and splits table data into several chunks. chunks are used as the unit for calculation. It can be understood that the data of all rows in the chunk is spliced, and then the crc32 value is calculated to obtain the checksum value of the chunk.
3. Bad Block Detection and repair
In the previous two steps, pt-table-sync and pt-table-checksum have the same algorithm and principle. Next, it starts to be different:
Pt-table-checksum is only for verification. Therefore, it stores the checksum result to the statistical table and records the executed SQL statement to the binlog. The task is completed. Statement-level replication transfers the computing logic to the slave database and performs the same computing in the slave database. The pt-table-checksum algorithm does not care about the slave database latency. The latency is calculated in the same way (some colleagues do not understand this. For details, refer to my previous article ), it will not affect the correctness of the calculation results (but we will still check the delay because too much delay will affect the business, so we always need to add-max-lag to limit the traffic ).
Pt-table-sync is different. It first needs to complete the checksum value calculation of the chunk. Once the checksum value of the same chunk on the master and slave is found to be different, it goes deep into the chunk and compares and fixes problematic rows row by row. Its computing logic is described as follows (for example, to fix data inconsistency between the master and slave structures, business dual writes are more complicated to fix-due to issues related to conflict resolution and benchmark selection, it is limited to space, ):
Important options Security Options
-[No] check-triggers: check whether a trigger exists. If yes, a warning is returned.
-[No] foreign-key-checks the primary and foreign key constraints by default. If any, a warning is displayed.
-[No] unique-checks: Check whether a unique index exists. If no index exists, a warning is displayed.
Filter options
-Ignore-databases
-Ignore-engines
-Ignore-tables
Other options
-Replicate = s is used in combination with pt-table-checksum. It is repaired only and not verified. Results verified before pt-table-checksum
-Bidirectional synchronization. Generally, the data in the master database prevails. If bidirectional synchronization is enabled, it is complicated to define conflict resolution rules.
Example
Assume 10.55.55.55 is the master database and 10.73.73.73 is its slave database. The port number is 3306.
1. Verify first:
PTDEBUG = 1. /pt-table-checksum -- user = user -- password = pass -- host = 10.55.55.55 -- port = 3306 -- databases = elink -- tables = my_cms_10 -- recursion-method = processlist 2. based on the verification results, only the inconsistency between the slave database 10.73.73.73 and the master database is fixed:
PTDEBUG = 1. /pt-table-sync -- execute -- replicate percona. checksums -- sync-to-master h = 10.73.73.73, P = 3306, u = user, p = pass 3. after the repair, verify again. Execute the statement in step 1.
4. check the repair result: log on to 10.73.73.73 and run the following SQL statement. If the returned result is null, the repair is successful: select * from percona. checksums where master_cnt <> this_cnt OR master_crc <> this_crc or isnull (master_crc) <> ISNULL (this_crc)
Summary
The pt-table-sync tool is complex and powerful. It modifies the database content, which may cause security incidents. To ensure security, we recommend that you use pt-table-checksum to regularly check data consistency, manually re-build the slave database with severe damage, and finally Use pt-table-sync to fix the remaining slave database.