[Original] How to check consistency between two tables at the database level

Source: Internet
Author: User
How can we check whether the contents of two tables are consistent? This requirement comes up most often on a replica (slave) server, where data consistency has to be verified. There are only two approaches: the first starts from the database, and the second starts from the application. Here I list some ways to solve the problem at the database level.

The first step is to check whether the two tables have the same number of records. If the counts differ, the tables are already inconsistent and no other method is needed.

Here we use two tables t1_old and t1_new for demonstration.

Table structure:

    create table t1_old (id int(11) not null, log_time timestamp default null);
    create table t1_new (id int(11) not null, log_time timestamp default null);

Both tables contain 100 records:

    mysql> select count(*) from t1_old;
    +----------+
    | count(*) |
    +----------+
    |      100 |
    +----------+
    1 row in set (0.31 sec)

    mysql> select count(*) from t1_new;
    +----------+
    | count(*) |
    +----------+
    |      100 |
    +----------+
    1 row in set (0.00 sec)
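The row-count pre-check can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database instead of MySQL, the helper names `build_demo` and `row_counts_match` are hypothetical, and the timestamp value is arbitrary sample data.

```python
import sqlite3


def build_demo():
    """Create in-memory stand-ins for t1_old and t1_new with 100 rows each."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1_old (id INTEGER NOT NULL, log_time TEXT)")
    conn.execute("CREATE TABLE t1_new (id INTEGER NOT NULL, log_time TEXT)")
    # Arbitrary sample rows; the real tables would hold production data.
    rows = [(i, "2000-01-01 00:00:00") for i in range(1, 101)]
    conn.executemany("INSERT INTO t1_old VALUES (?, ?)", rows)
    conn.executemany("INSERT INTO t1_new VALUES (?, ?)", rows)
    return conn


def row_counts_match(conn, table_a, table_b):
    """Return True when both tables hold the same number of rows."""
    # Table names cannot be bound as parameters, so they are interpolated.
    # Only pass trusted identifiers here.
    count_a = conn.execute(f"SELECT COUNT(*) FROM {table_a}").fetchone()[0]
    count_b = conn.execute(f"SELECT COUNT(*) FROM {table_b}").fetchone()[0]
    return count_a == count_b
```

Calling `row_counts_match(build_demo(), "t1_old", "t1_new")` returns True for the demo data. A False result means the tables are already known to differ, so the heavier checks can be skipped.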


Method 1: Use addition (UNION) and deduplicate.

Since UNION removes duplicate rows from the combined result of the two queries, this check is very simple:

    mysql> select count(*) from (select * from t1_old union select * from t1_new) as T;
    +----------+
    | count(*) |
    +----------+
    |      100 |
    +----------+
    1 row in set (0.06 sec)

The number of records here is 100, the same as in each table, which is preliminary evidence that the contents of the two tables are consistent. However, this method has a flaw: in some cases the count matches even though the contents differ. For example:

    mysql> create table t1_old1 (id int);
    Query OK, 0 rows affected (0.27 sec)

    mysql> create table t1_new1 (id int);
    Query OK, 0 rows affected (0.09 sec)

    mysql> insert into t1_old1 values (1), (2), (3), (5);
    Query OK, 4 rows affected (0.15 sec)
    Records: 4  Duplicates: 0  Warnings: 0

    mysql> insert into t1_new1 values (2), (2), (3), (5);
    Query OK, 4 rows affected (0.02 sec)
    Records: 4  Duplicates: 0  Warnings: 0

    mysql> select * from t1_old1;
    +------+
    | id   |
    +------+
    |    1 |
    |    2 |
    |    3 |
    |    5 |
    +------+
    4 rows in set (0.00 sec)

    mysql> select * from t1_new1;
    +------+
    | id   |
    +------+
    |    2 |
    |    2 |
    |    3 |
    |    5 |
    +------+
    4 rows in set (0.00 sec)

    mysql> select count(*) from (select * from t1_old1 union select * from t1_new1) as T;
    +----------+
    | count(*) |
    +----------+
    |        4 |
    +----------+
    1 row in set (0.00 sec)

The union returns 4 rows, equal to the row count of each table, even though t1_old1 contains the value 1 and t1_new1 contains a duplicate 2 instead. So in this case the method is invalid.
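The duplicate-row pitfall is easy to reproduce. The sketch below is an assumption-laden illustration rather than the article's own setup: it runs the same UNION query against SQLite through Python's sqlite3 module, and `make_tables` and `union_count` are hypothetical helper names.

```python
import sqlite3


def make_tables(old_ids, new_ids):
    """Build in-memory copies of t1_old1 and t1_new1 from two lists of ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1_old1 (id INTEGER)")
    conn.execute("CREATE TABLE t1_new1 (id INTEGER)")
    conn.executemany("INSERT INTO t1_old1 VALUES (?)", [(i,) for i in old_ids])
    conn.executemany("INSERT INTO t1_new1 VALUES (?)", [(i,) for i in new_ids])
    return conn


def union_count(conn, table_a, table_b):
    """Count the distinct rows in the UNION of two tables."""
    sql = (
        f"SELECT COUNT(*) FROM "
        f"(SELECT * FROM {table_a} UNION SELECT * FROM {table_b}) AS t"
    )
    return conn.execute(sql).fetchone()[0]


# Same data as the example above: the tables differ, yet the count is 4.
conn = make_tables([1, 2, 3, 5], [2, 2, 3, 5])
print(union_count(conn, "t1_old1", "t1_new1"))  # 4
```

The check fails here because UNION collapses the duplicate 2 in t1_new1, which hides the missing 1. A matching count is therefore only meaningful when neither table contains duplicate rows.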


Method 2: Use subtraction (EXCEPT) and check that the result is zero.

Because MySQL did not provide a subtraction operator at the time of writing (EXCEPT was added in MySQL 8.0.31), we use PostgreSQL for this check:

    t_girl=# select count(*) from (select * from t1_old except select * from t1_new) as T;
     count
    -------
         0
    (1 row)

    Time: 1.809 ms

The result is 0, which indicates that the contents of the two tables are consistent. Now let's test the other situation mentioned under the first method:

    t_girl=# select count(*) from (select * from t1_old1 except select * from t1_new1) as T;
     count
    -------
         1
    (1 row)

    Time: 9.837 ms

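A non-zero result means the tables are inconsistent, so here the mismatch is detected directly. Note that EXCEPT only returns rows of the first table that are absent from the second, so a thorough check runs the query in both directions.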
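To illustrate why the direction of the subtraction matters, here is a minimal sketch that runs EXCEPT both ways. It assumes SQLite (via Python's sqlite3 module) as a stand-in for PostgreSQL, and `except_counts` is a hypothetical helper name.

```python
import sqlite3


def except_counts(conn, table_a, table_b):
    """Return (rows only in table_a, rows only in table_b) using EXCEPT."""

    def one_way(left, right):
        sql = (
            f"SELECT COUNT(*) FROM "
            f"(SELECT * FROM {left} EXCEPT SELECT * FROM {right}) AS t"
        )
        return conn.execute(sql).fetchone()[0]

    return one_way(table_a, table_b), one_way(table_b, table_a)


# Same data as the t1_old1 / t1_new1 example above.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1_old1 (id INTEGER);
    CREATE TABLE t1_new1 (id INTEGER);
    INSERT INTO t1_old1 VALUES (1), (2), (3), (5);
    INSERT INTO t1_new1 VALUES (2), (2), (3), (5);
    """
)
print(except_counts(conn, "t1_old1", "t1_new1"))  # (1, 0)
```

With this data, subtracting t1_new1 from t1_old1 finds the missing 1, while the reverse direction returns nothing. Treating the tables as consistent only when both numbers are zero avoids depending on which table is written first.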

Method 3: Use a full outer join. This is also the worst practice when the tables hold a very large number of records.

Again I use PostgreSQL to demonstrate:

    t_girl=# select count(*) from t1_old as a full outer join t1_new as b using (id, log_time) where a.id is null or b.id is null;
     count
    -------
         0
    (1 row)

    Time: 5.002 ms

The result is 0, so the contents are consistent.
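The join-based check can also be sketched outside PostgreSQL. SQLite versions before 3.39 have no FULL OUTER JOIN, so the example below swaps in an emulation: two LEFT JOIN anti-joins, one per direction, whose counts are added together. The function name `unmatched_rows` is hypothetical, and the sketch assumes `id` is declared NOT NULL, as it is in t1_old and t1_new.

```python
import sqlite3


def unmatched_rows(conn, table_a, table_b):
    """Count rows in either table that have no (id, log_time) match in the other."""

    def one_way(left, right):
        # Left rows with no partner come back with NULLs on the right side.
        sql = (
            f"SELECT COUNT(*) FROM {left} AS a "
            f"LEFT JOIN {right} AS b "
            f"ON a.id = b.id AND a.log_time = b.log_time "
            f"WHERE b.id IS NULL"
        )
        return conn.execute(sql).fetchone()[0]

    return one_way(table_a, table_b) + one_way(table_b, table_a)
```

A result of 0 corresponds to the zero count returned by the full outer join query above. Like that query, the comparison uses plain equality, so rows whose `log_time` is NULL never match each other and are reported as unmatched.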


Method 4: Use checksum for verification.

For example, in MySQL, if the checksum values of the two tables match, the contents are consistent:

    mysql> checksum table t1_old;
    +---------------+----------+
    | Table         | Checksum |
    +---------------+----------+
    | t_girl.t1_old | 60614552 |
    +---------------+----------+
    1 row in set (0.00 sec)

    mysql> checksum table t1_new;
    +---------------+----------+
    | Table         | Checksum |
    +---------------+----------+
    | t_girl.t1_new | 60614552 |
    +---------------+----------+
    1 row in set (0.00 sec)

However, this method only works when the two tables have the same structure. For example, if I change the type of a column in t1_old, the checksum values no longer match:

    mysql> alter table t1_old modify id bigint;
    Query OK, 100 rows affected (0.23 sec)
    Records: 100  Duplicates: 0  Warnings: 0

    mysql> checksum table t1_old;
    +---------------+------------+
    | Table         | Checksum   |
    +---------------+------------+
    | t_girl.t1_old | 3211623989 |
    +---------------+------------+
    1 row in set (0.00 sec)

    mysql> checksum table t1_new;
    +---------------+----------+
    | Table         | Checksum |
    +---------------+----------+
    | t_girl.t1_new | 60614552 |
    +---------------+----------+
    1 row in set (0.00 sec)
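Where CHECKSUM TABLE is not available, a comparable idea is to hash the rows on the client side. The sketch below is not MySQL's checksum algorithm; it is a hypothetical stand-in that feeds the sorted rows of a table into SHA-256 using Python's hashlib and sqlite3 modules, with `table_digest` as an assumed helper name.

```python
import hashlib
import sqlite3


def table_digest(conn, table):
    """Return a SHA-256 hex digest over all rows, independent of row order."""
    rows = conn.execute(f"SELECT * FROM {table}").fetchall()
    digest = hashlib.sha256()
    # Sort by the text form of each row so NULLs (None) do not break ordering.
    for row in sorted(rows, key=repr):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()
```

Two tables with the same rows, duplicates included, yield the same digest regardless of insertion order. Unlike the MySQL example above, this sketch hashes the values as Python sees them, so whether a change of column type alters the digest depends on how the driver returns the values. It also reads every row into memory, which makes it suitable only for small tables.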



Therefore, of the database-level methods above, subtraction (checking that the difference is zero) is relatively reliable. The other methods are better suited to detection under specific circumstances.


This article is from "god, let's see it !" Blog, please keep this source
