Pt-table-checksum is now the best tool to view master-slave consistency
Let's start with an example to help you get started quickly.
execute in the main library to create the execution user: MySQL>GRANT SELECT, PROCESS, SUPER,REPLICATIONSLAVE,CREATE,DELETE,INSERT,UPDATE on *.* to 'USER'@'Master_host'Identified by 'PASSWORD';
Note: Create a user, these permissions are necessary, or the subsequent execution will be an error, of course, if you do not want to grant so many permissions, then go to the document, the permissions corresponding to the work of the first self-dried or directly on the command line, note If you do not want to set create permissions, Take a look at the following parameter description:--replicateshell>./Pt-Table-Checksum--host= ' master_host '--user= ' user '--password= ' password '--port= ' Port '--databases=zabbix--ignore-tables=ignore_ Table--recursion-method=processlist
(2) Recursion-method if not set, will be error: Diffs cannot be detected because no slaves were found. Its parameters are four: Processlist/hosts/dsn=dsn /no, used to determine the way to find slave is show full Processlist or show slave hosts or command line directly designated or not ready to find from the library, see the following parameters of the
!!! One place to pay special attention is that the master/slave required to be checked must be based on the statement-based copy mode, if not statement-based, Then add the parameter--no-check-binlog-format to avoid binlog format check, need to confirm in advance!
!!! In addition, although the tool has a lot of checks to ensure its use of security, but the use of the process will still have a lock and cause blocking, so it is best not to use when the system is particularly busy
!!! If there is inconsistency, after synchronizing the data with Pt-table-sync, want to re-check again, it is better to truncate TABLE checksums to avoid the checksums table behind the previous data interference;
I. Parameters connecting the master and slave libraries:
--host--socket--user--password--pid--port
Two. Specify parameters for the comparison range
1. Specify the Library
--databases/--ignore-databases These libraries are ignored in the library/comparison process to compare
--databases-regex/--ignore-databases-regex Ibid, but can be matched with regular matches
2. Specify the table
--tables/--ignore-tables These tables are ignored during the table/comparison to compare
--tables-regex/--ignore-tables-regex Ibid, but can be matched with regular matches
3. Specify the column
--columns/--ignore-columns These columns are ignored during the column/comparison to compare
4. Directly Specify the table range
--where directly specify the range to compare in the table
5. According to the engine selection table
--engines/--ignore-engines Compare the tables that contain these engines in the specified engine table/comparison process
Three. Specify parameters for behavior after a connection break
--resume if the master-slave consistency check is interrupted, you can use this parameter to enable the tool to continue checking from the last table checked at the last interrupt.
--retries If there is a non-fatal interruption in the inspection process, such as a kill or a delay from the library, the tool will automatically attempt to re-connect when the parameter is specified.
Four. Parameters to be focused on
1.--[no]check-binlog-format said, no more repetition.
2.--recursion-method
There are four parameters: processlist/hosts/DSN=dsn/no, the default is processlist,hosts, but it is best to specify it, it is recommended to specify--recursion-method=processlist, No generally not used
when using the DSN =dsn method, you need to first create a table in the library, such as building a DNSN table in the Percona library .
The Build Table statement is:
create table ' DSNs ' (' id ' int ( not null auto_increment, ' parent_id ' int (11 ) default null , ' DSN ' varchar (not null , primary key (' id '));
Build and plug in the master from copying information data, such as: INSERT INTO Table DSNs (DSN) values (H=slave_host,u=repl_user,p=repl_password,p=port);
You can then use the DSN method: The command is:--recursion-method Dsn=d=percona,t=dsns.
3.--replicate
Used to specify the name of the table that holds the result of the calculation, by default percona.checksums, the tool automatically creates the Library Percona and table checksums and enters the checksum check result into the table, and if you use that parameter to specify the table, the table structure must be:
CREATE TABLE checksums ( db char (+) not null, tbl char (+) not null, chunk int Not NULL, chunk_time float null, chunk_index varchar ($) null, Lower_ Boundary text null, upper_boundary text null, THIS_CRC char (+) not NULL, this _cnt int not null, MASTER_CRC char (+) null, master_cnt int null, TS timestamp not NULL, PRIMARY KEY (db, TBL, chunk), INDEX ts_db_tbl (TS, DB, tbl)) Engine=innodb;
Note that the storage engine settings, if the check table is a InnoDB table, set the InnoDB engine, if the check table and checksums table engine inconsistencies, such as MyISAM and InnoDB, respectively, will cause replication errors: "Different error on Master and slave. "!!!
Five. Other parts of the parameters are detailed:
1.--[no]check-replication-filters
The default is checked to be used during the master-slave copy process: Ignore. Filter out the table, the check will break and exit, if you want to avoid this check can be set--no-check-replication-filters
2.--chunk-index (type:string)
By default, the tool will select the most appropriate index to explain determine the size of the chunk, but if you want to do it with a different index, you can specify it with this parameter, and the tool will add the specified index in the form of force index.
3.--chunk-index-columns (Type:int)
Can be used to specify a composite index using the first few columns to assist in chunking
4.--chunk-size
Directly determine the size of the chunk, the default 1000 rows of data, but not recommended, it is recommended to use--chunk-time instead
5.--chunk-time
The default is 0.5 seconds, and the tool calculates the number of rows of data (that is, chunk) that can be processed at that specified time, depending on how busy the current system is running, which is more flexible
6.--[no]empty-replicate-table
By default yes, delete the table information already in the checksums table before each checklist to facilitate subsequent re-insertion of the new check information
7.--float-precision (Type:int)
Set the rounding method of floating-point numbers to avoid different versions or other specific cases, the master-slave between the rounding of floating-point numbers caused by different ways to detect inconsistencies, If you specify a value of 2, for example, then the values 1.008 and 1.009 'll be rounded to 1.01, and'll checksum as equal
8.--function
function to calculate checksum value, default is CRC32, other fnv1a_64, Murmur_hash, SHA1, MD5, etc.
9.--max-lag
Default 1S, the master-slave maximum delay, over this delay time, will stop to wait for synchronization from the library, the method is determined to take the value of Seconds_behind_master
Ten.--progress
After the specified parameters can be set in the execution of the operation of the process output to stderr, such as master-slave delay from the library waiting, wait time, etc., the specified time followed by two parameter values, the default is "time,30", the previous parameter is: percentage, time, or iterations; The last specified percentage, the specific time or the number of intervals
Six. Operating principle
Pt-table-checksum uses the CRC32 algorithm for data validation by default, and the tool needs to set Binlog to statement format, is because the tool can be derived from the master-slave consistency depends on the statement pattern on the basis of the same SQL statements on the master-slave database, the execution of the results of the main library after the SQL statement passed from the library, the library after the execution of a pass, but also get their own results, the most important statement is:
SELECT COUNT(*) asCntCOALESCE(LOWER(CONV (Bit_xor (CAST(CRC32 (Concat_ws ('#', various column names)) asUNSIGNED)),Ten, -)),0) asCRC from`Database`.`Table' ForceINDEX(`PRIMARY`)WHERE(' ID '>= '1')) and(' ID '<= ' +'))
Note: The Where condition calculates the range to be performed based on the system's busy level
CNT is the number of rows currently checked by the block, unsigned is the check value of the block data calculated
If the main library and the two values from the library are the same, then the data is consistent, if not the same, then the Lord is never consistent, of course, the character sets, floating-point numbers and other problems need to be avoided in order to avoid the wrong
The tool processes the results of the master and slave to the checksums table and renders some results in the screen output, work over
Pt-table-checksum interpretation