How to quickly compare the data on two different MySQL instances and find the difference

Source: Internet
Author: User

In day-to-day MySQL operations, developers sometimes want to compare the data on two different instances and find the differences. Every column has to be compared, not just the primary key. How can this be done?

The first solution is to write a program that fetches every row from both instances and compares them one by one. This works in principle, but the comparison takes a long time.

The second solution is to concatenate all the columns of each row, compute a checksum of the result, and compare the checksums. This looks feasible, so let's try it.

First, concatenate the values of all columns with MySQL's CONCAT function. CONCAT returns NULL if any of its arguments is NULL, so wrap each column in IFNULL to replace NULL values, for example:

CONCAT(IFNULL(C1,''),IFNULL(C2,''))
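To make the behaviour of this expression concrete, here is a minimal Python sketch that mimics it on the client side. The helper names `concat_ifnull` and `row_md5` are made up for illustration, and the sketch assumes values are rendered with `str()` and encoded as UTF-8, which may not match the server's conversion for every column type. It also shows a side effect worth knowing about: because NULL becomes an empty string and no separator is used, two different rows can produce the same concatenated string.

```python
import hashlib


def concat_ifnull(values):
    """Mimic CONCAT(IFNULL(v, ''), ...): None is replaced by an empty string."""
    return "".join("" if v is None else str(v) for v in values)


def row_md5(values):
    """Mimic MD5(CONCAT(IFNULL(...))) for one row, returned as lowercase hex."""
    return hashlib.md5(concat_ifnull(values).encode("utf-8")).hexdigest()


# A NULL in the middle simply disappears from the concatenated string.
print(concat_ifnull([1, None, 3]))  # 13

# Different rows, same concatenation, therefore the same checksum.
print(row_md5([1, None, 3]) == row_md5([1, 3, None]))  # True
```

If such collisions matter for your data, one option is to put a separator between the columns and use a marker other than the empty string for NULL.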

When the table has many columns, you can generate this expression from information_schema.COLUMNS instead of writing it by hand:

# Generate the IFNULL(...) list for every column of the table
SELECT GROUP_CONCAT('IFNULL(', COLUMN_NAME, ','''')' ORDER BY ORDINAL_POSITION)
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'db_name'
  AND TABLE_NAME = 'table_name';
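The same generation step can also be done on the client once the column names are known. The following is a rough sketch only: `build_md5_select` is a hypothetical helper, the table and column names are examples, and identifiers are inserted without quoting, so it assumes plain names that need no backticks.

```python
def build_md5_select(table, columns, key="id"):
    """Build a per-row checksum query from a list of column names.

    Assumes simple identifiers that do not need backtick quoting.
    """
    parts = ",".join(f"IFNULL({col},'')" for col in columns)
    return f"SELECT {key}, MD5(CONCAT({parts})) AS md5_value FROM {table}"


print(build_md5_select("t_test01", ["id", "c1", "c2"]))
# SELECT id, MD5(CONCAT(IFNULL(id,''),IFNULL(c1,''),IFNULL(c2,''))) AS md5_value FROM t_test01
```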

Suppose we have a test table:

CREATE TABLE t_test01 (
    id INT AUTO_INCREMENT PRIMARY KEY,
    C1 INT,
    C2 INT
);

We can then assemble the following SQL:

SELECT
    id,
    MD5(CONCAT(IFNULL(id,''), IFNULL(c1,''), IFNULL(c2,''))) AS md5_value
FROM t_test01;

Run this statement on both instances and compare the two result sets with a diff tool such as Beyond Compare. This makes it easy to find the rows that differ, together with their primary key IDs.
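If a graphical diff tool is not convenient, the comparison itself is simple to script. The sketch below assumes the two result sets have already been fetched as (id, md5_value) pairs; `diff_checksums` is an illustrative name and the sample data is invented.

```python
def diff_checksums(rows_a, rows_b):
    """Compare two (id, md5_value) result sets.

    Returns three sorted lists of ids: present only in A, present only in B,
    and present in both but with different checksums.
    """
    a, b = dict(rows_a), dict(rows_b)
    only_a = sorted(a.keys() - b.keys())
    only_b = sorted(b.keys() - a.keys())
    changed = sorted(k for k in a.keys() & b.keys() if a[k] != b[k])
    return only_a, only_b, changed


instance_a = [(1, "aa"), (2, "bb"), (3, "cc")]
instance_b = [(1, "aa"), (2, "xx"), (4, "dd")]
print(diff_checksums(instance_a, instance_b))  # ([3], [4], [2])
```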

For tables with a large amount of data, the result set is itself very large and hard to compare. To shrink it, combine the MD5 values of many rows and compute a single MD5 over them. If that combined MD5 is the same on both instances, all the rows in the group are the same. If it differs, at least one row in the group differs, and only that group needs a row-by-row comparison.

Suppose we compare in groups of 1,000 rows. To merge the per-row MD5 values within a group we use GROUP_CONCAT. Note that GROUP_CONCAT needs an ORDER BY clause so that the values are merged in the same order on both instances. The SQL statement is as follows:

SELECT
    MIN(id) AS min_id,
    MAX(id) AS max_id,
    COUNT(1) AS row_count,
    MD5(GROUP_CONCAT(
        MD5(CONCAT(IFNULL(id,''), IFNULL(c1,''), IFNULL(c2,'')))
        ORDER BY id
    )) AS md5_value
FROM t_test01
GROUP BY (id DIV 1000);

The execution result is:

min_id    max_id    row_count    md5_value
0         999       1000         7d49def23611f610849ef559677fec0c
1000      1999      1000         95d61931aa5d3b48f1e38b3550daee08
2000      2999      1000         b02612548fae8a4455418365b3ae611a
3000      3999      1000         fe798602ab9dd1c69b36a0da568b6dbb
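The grouping logic can be reproduced in Python to see what each output column means. This is only an approximation of the SQL, under several assumptions: ids are non-negative integers, values are rendered with `str()`, and the per-row hashes are joined with a comma, which is GROUP_CONCAT's default separator. The function names are illustrative.

```python
import hashlib
from itertools import groupby


def md5_hex(text):
    """MD5 of a string as lowercase hex, like MySQL's MD5()."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def chunk_checksums(rows, chunk_size=1000):
    """rows: iterable of tuples whose first element is the id.

    Returns a list of (min_id, max_id, row_count, md5_value), one entry per
    group of ids sharing the same id // chunk_size (assumes ids >= 0).
    """
    ordered = sorted(rows, key=lambda r: r[0])  # plays the role of ORDER BY id
    result = []
    for _, group in groupby(ordered, key=lambda r: r[0] // chunk_size):
        group = list(group)
        row_hashes = [
            md5_hex("".join("" if v is None else str(v) for v in row))
            for row in group
        ]
        result.append(
            (group[0][0], group[-1][0], len(group), md5_hex(",".join(row_hashes)))
        )
    return result


rows = [(i, i * 2, None) for i in range(2500)]
for min_id, max_id, row_count, _ in chunk_checksums(rows):
    print(min_id, max_id, row_count)
# 0 999 1000
# 1000 1999 1000
# 2000 2499 500
```

Because the rows are sorted by id before hashing, the input order does not affect the checksums, which is what the ORDER BY inside GROUP_CONCAT guarantees on the server.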

When only a few rows differ, even a table with tens of millions of rows is easy to handle: min_id and max_id identify the groups of 1,000 rows whose checksums differ, and a row-by-row MD5 comparison within those groups then finds the differing rows.
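This drill-down step can be sketched as follows. The input is assumed to be the group-level result from each instance as (min_id, max_id, row_count, md5_value) tuples; `differing_ranges` is a hypothetical helper. Groups are matched by min_id // chunk_size rather than by min_id itself, since a missing row can shift min_id on one side.

```python
def differing_ranges(chunks_a, chunks_b, chunk_size=1000):
    """Return the id ranges (low, high) whose group summaries differ.

    A group missing on one side also counts as different.
    """
    a = {c[0] // chunk_size: c for c in chunks_a}
    b = {c[0] // chunk_size: c for c in chunks_b}
    ranges = []
    for g in sorted(a.keys() | b.keys()):
        if a.get(g) != b.get(g):
            ranges.append((g * chunk_size, (g + 1) * chunk_size - 1))
    return ranges


chunks_a = [(0, 999, 1000, "h0"), (1000, 1999, 1000, "h1"), (2000, 2999, 1000, "h2")]
chunks_b = [(0, 999, 1000, "h0"), (1000, 1999, 1000, "zz")]
print(differing_ranges(chunks_a, chunks_b))  # [(1000, 1999), (2000, 2999)]
```

Each returned range can then be used to restrict the per-row checksum query, for example with a condition such as id BETWEEN 1000 AND 1999, so that only the suspicious rows are compared individually.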

PS:

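When using GROUP_CONCAT, you need to raise the MySQL variable group_concat_max_len. Its default value is 1024, and any part of the result beyond that length is truncated. A group of 1,000 MD5 values is far longer than 1024 characters, so with the default setting only the first few dozen rows of each group would contribute to the checksum and differences in the remaining rows would go unnoticed.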
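A quick calculation gives the minimum value needed for a given group size. The sketch assumes each item is a 32-character MD5 hex string and the separator is the default single comma; `required_group_concat_len` is an illustrative name.

```python
def required_group_concat_len(rows_per_group, item_len=32, separator_len=1):
    """Length of a GROUP_CONCAT result holding rows_per_group items."""
    if rows_per_group <= 0:
        return 0
    return rows_per_group * item_len + (rows_per_group - 1) * separator_len


print(required_group_concat_len(1000))  # 32999
```

For groups of 1,000 rows, group_concat_max_len therefore has to be at least 32999 on both instances before the group-level query is run.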

Reference: https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html
