MySQL cluster data consistency check
MySQL is currently in full swing in the Internet industry: many large websites build MySQL master-slave clusters to implement high-performance, high-availability storage solutions. A common feature of such clusters is that data is synchronized from master to slave through replication, keeping the two consistent. This enables read/write splitting, where the slaves share the read load of the master, improving the availability and performance of the whole cluster.
Why is a data consistency check needed? Most MySQL services are deployed on PC clusters, and once a cluster reaches a certain scale, hardware failures are almost inevitable. MySQL replication is asynchronous, so when a host fails, data may be lost, leaving master and slave inconsistent and unable to serve traffic correctly. In addition, when an existing cluster's capacity is insufficient it must be resized, which involves data migration. Migration generally consists of a full copy plus incremental catch-up; if services are not stopped, data consistency must be verified after the migration to ensure the business is not affected.
What is data consistency? Here we consider only MySQL, or relational databases in general. Consistency covers two aspects: table structure and data content. Table structures change relatively rarely, so the probability of structural inconsistency is small, and checking it is comparatively easy (a quick manual check is sketched below). Data content, by contrast, changes constantly, so content inconsistency is far more common and is what we mainly care about.
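For the structural side, one simple manual approach, using the example table that appears later in this article, is to dump the definition on each server and diff the two outputs:

-- Run on the master and on the slave, then compare the results:
SHOW CREATE TABLE chuck.table_pt_check\G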
How is a data consistency check implemented? One idea is to compare the master's and slave's tables row by row, field by field. The other is to compute a checksum over the data on each side and compare only the checksums, without comparing fields row by row. Both ideas are simple. The first is highly accurate but performs poorly, because it returns huge result sets and therefore generates heavy network and disk IO. The second is the opposite: better performance and less IO, at the cost of more CPU (to compute the checksums) and lower accuracy. In a production environment, however, data changes constantly, so neither approach is as simple as it sounds. Locking the table keeps the data static during verification, and the table is unlocked once the check succeeds. MySQL's built-in CHECKSUM TABLE command keeps data static by locking the table in exactly this way (see the example below). This works for small tables with little traffic, but if the table is very large the check takes a long time, which a production environment cannot tolerate. Since the table must be locked to keep it static, can the lock time be shortened? pt-table-checksum chunks the table so that only some rows are locked at a time during verification, reducing the impact on the service.
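As a minimal illustration of the built-in approach (run on the master and on the slave, then compare the two values by hand):

-- MySQL's built-in per-table checksum; the table stays static for the
-- duration of the scan, so this is only practical for small, lightly
-- used tables:
CHECKSUM TABLE chuck.table_pt_check;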
Percona's pt-table-checksum is widely used in the industry for this. Below I describe how to use it and how it works, and analyze its shortcomings and possible improvements.
pt-table-checksum executes a checksum SQL statement on the master; replication then executes the same statement on the slave (pt-table-checksum requires statement-based replication). A REPLACE ... SELECT statement stores the checksum results in a result table, and comparing the row count and checksum of the same chunk on the master and the slave determines whether their data is consistent. Note that pt-table-checksum does not compute a single checksum per table: for a very large table that would put a heavy load on the DB and affect normal business, and one checksum per table would degenerate into MySQL's CHECKSUM TABLE command, which not only locks the table but is also inaccurate. Instead, pt-table-checksum divides the table into chunks based on the chunk size set by the user and computes one checksum per chunk. Even for a very large table, only part of the records are locked at any moment, greatly reducing the load on the DB. Multiple tables can be verified concurrently, which greatly improves efficiency, and the --max-load parameter prevents excessive load.
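Because the tool depends on statement-based replication to re-execute its checksum SQL on the slave, it is worth confirming the binlog format before a run. A quick check (a global ROW setting does not necessarily block the tool, since it switches its own session to statement mode when it has sufficient privileges):

SHOW GLOBAL VARIABLES LIKE 'binlog_format';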
pt-table-checksum basically meets day-to-day needs, but there is still room for improvement. First, it only supports table-level concurrency: checking a single large table still takes a long time, and when multiple tables are verified in parallel, the degree of parallelism cannot be set directly with a parameter, only indirectly via --max-load. Second, although per-chunk checksums speed up verification, each chunk (1,000 rows by default) is reduced to a single 4-byte checksum, so collisions are possible: two chunks with different data can hash to the same value, even though the checksum expression pt-table-checksum builds is fairly elaborate. Finally, because the checksum SQL executes on the master and on the slave at different times, new data written in that window can produce false positives. Interested readers can optimize these points further.
Finally, a brief introduction to using pt-table-checksum. I will not list every parameter; interested readers can refer to http://www.percona.com/doc/percona-toolkit/2.2/pt-table-checksum.html
1. Create a user for the check and grant privileges:
GRANT ALL PRIVILEGES ON *.* TO 'ptcheck'@'%' IDENTIFIED BY 'ptcheck';
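GRANT ALL is the quickest way to get started. For production, a narrower grant along these lines should be enough; this is a sketch, and the exact privilege list should be verified against the Percona documentation:

-- Global read and replication-related privileges for the tool, plus
-- write access to the result table only (privilege set is an assumption;
-- check the Percona docs for your toolkit version):
GRANT SELECT, PROCESS, SUPER, REPLICATION SLAVE ON *.* TO 'ptcheck'@'%';
GRANT INSERT, UPDATE, DELETE ON test.checksums TO 'ptcheck'@'%';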
2. The structure of the test table table_pt_check:
Create Table: CREATE TABLE `table_pt_check` (
  `c1` int(11) NOT NULL AUTO_INCREMENT,
  `c2` int(11) DEFAULT NULL,
  PRIMARY KEY (`c1`)
) ENGINE=InnoDB AUTO_INCREMENT=26672747 DEFAULT CHARSET=utf8
3. Check the table_pt_check table in the chuck database:
pt-table-checksum --host='127.0.0.1' --user='ptcheck' --password='ptcheck' --port=3306 --databases='chuck' --tables='table_pt_check' --replicate=test.checksums
--replicate=test.checksums specifies that the verification results are stored in the checksums table of the test database. After the three steps above, you can check whether the data in the master and slave databases is consistent.
The structure of the table that stores the verification results is as follows:
Create Table: CREATE TABLE `checksums` (
  `db` char(64) NOT NULL,                    -- database name
  `tbl` char(64) NOT NULL,                   -- table name
  `chunk` int(11) NOT NULL,                  -- chunk number
  `chunk_time` float DEFAULT NULL,           -- chunk execution time
  `chunk_index` varchar(200) DEFAULT NULL,   -- index used to chunk the table (primary key or unique index)
  `lower_boundary` text,                     -- lower bound of the chunk
  `upper_boundary` text,                     -- upper bound of the chunk
  `this_crc` char(40) NOT NULL,              -- checksum of the chunk on this server
  `this_cnt` int(11) NOT NULL,               -- number of rows in the chunk on this server
  `master_crc` char(40) DEFAULT NULL,        -- checksum of the chunk on the master
  `master_cnt` int(11) DEFAULT NULL,         -- number of rows in the chunk on the master
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`db`, `tbl`, `chunk`),
  KEY `ts_db_tbl` (`ts`, `db`, `tbl`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
The SQL statement used to find master-slave inconsistencies in the results is as follows:
SELECT db,
       tbl,
       SUM(this_cnt) AS total_rows,
       COUNT(*) AS chunks
FROM test.checksums
WHERE (master_cnt <> this_cnt
       OR master_crc <> this_crc
       OR ISNULL(master_crc) <> ISNULL(this_crc))
GROUP BY db, tbl;
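If this query reports differences, pt-table-sync from the same Percona Toolkit can reuse the stored checksum results to generate repair statements. A sketch, with --print so that the statements are only displayed, not executed (the slave DSN is a placeholder):

# Print the statements that would bring the slave back in sync with its
# master, based on the chunks recorded in test.checksums:
pt-table-sync --replicate=test.checksums --sync-to-master --print h=slave_host,u=ptcheck,p=ptcheck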
The --explain parameter displays the SQL statements pt-table-checksum runs during execution:
REPLACE INTO `test`.`checksums` (db, tbl, chunk, chunk_index, lower_boundary, upper_boundary, this_cnt, this_crc)
SELECT 'chuck', 'table_pt_check', '7', 'PRIMARY', '123', '123',
       COUNT(*) AS cnt,
       COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `c1`, `c2`, CONCAT(ISNULL(`c2`)))) AS UNSIGNED)), 10, 16)), 0) AS crc
FROM `chuck`.`table_pt_check` FORCE INDEX (`PRIMARY`)
WHERE (`c1` >= '000000') AND (`c1` <= '000000')
Note: BIT_XOR is the key function in computing the checksum. This aggregate function folds every column of every row in a chunk into the computation, so in theory a single checksum value determines whether the master's and slave's data in that chunk are consistent.
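A small demonstration of why an XOR aggregate suits this job: XOR is commutative and associative, so the result does not depend on row order, and master and slave can scan a chunk independently yet still produce comparable checksums. A minimal sketch:

-- Both queries return the same value even though the rows arrive in
-- different orders:
SELECT BIT_XOR(CRC32(v)) FROM (SELECT 'a' AS v UNION ALL SELECT 'b') t1;
SELECT BIT_XOR(CRC32(v)) FROM (SELECT 'b' AS v UNION ALL SELECT 'a') t2;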