Cross-Database Transaction Consistency Solutions (with an example)


Let's look at a cross-database transaction consistency problem. The scenario is simple: there are two systems, old and new, each with its own database, and the new database uses a database/table sharding design. To mitigate the risk of problems after the release, the old and new systems run in parallel. The business itself is straightforward: the system receives external data, then validates and processes it. To keep both systems in parallel, incoming data must be dual-written to both databases, which creates a cross-database transaction consistency problem.

The following figure shows this simple scenario.

A small problem can arise: the write to the old database may succeed while the write to the new database fails.

Assume the failure probability is about one in a thousand (it may be higher right after release). At the current volume of roughly 500,000 records a day, that means about 500 inconsistent records daily. Since this is a data accounting system, no record may be lost; otherwise the reconciliation results on the two sides will diverge. Consistency must therefore be guaranteed.

There are several solutions to this problem:

1. Use JTA or another transaction manager that supports distributed transactions.

The advantage of this approach is that ready-made implementations exist: most J2EE application servers provide a JTA implementation. The obvious drawback is that the solution is heavyweight. Besides the application server, JTA typically depends on an XAResource-capable JDBC driver from the database vendor, which often requires commercial support and paid licensing. XA drivers can also introduce subtle problems, especially when a transaction spans different database vendors. Finally, XA is based on the two-phase commit protocol: to complete one transaction, the transaction manager must make several round trips to each database, so efficiency suffers.
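The two-phase commit flow that XA is built on can be sketched as follows. This is a simplified in-memory simulation, not a real XA API; the `Participant` interface and coordinator class are illustrative names.

```java
import java.util.List;

// Minimal sketch of the two-phase commit protocol XA relies on.
// In real XA, each Participant is a database accessed via an XAResource.
interface Participant {
    boolean prepare();  // phase 1: vote yes/no on committing
    void commit();      // phase 2: make the changes durable
    void rollback();    // phase 2: undo, because some vote was "no"
}

class TwoPhaseCoordinator {
    // Returns true if the global transaction committed on all participants.
    static boolean execute(List<Participant> participants) {
        // Phase 1: ask every participant to prepare; any "no" aborts all.
        for (Participant p : participants) {
            if (!p.prepare()) {
                participants.forEach(Participant::rollback);
                return false;
            }
        }
        // Phase 2: every participant voted yes, so commit everywhere.
        participants.forEach(Participant::commit);
        return true;
    }
}
```

The multiple network round trips visible here (one prepare and one commit/rollback per participant) are exactly why the text calls XA "less efficient" than a single local transaction.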

2. Consider using the database's own data synchronization mechanism

If the schemas of the new and old databases are basically the same, this approach is reliable and relatively simple. Its limitations, however, are significant here. In this project the new "database" is not a single physical database but several, while the old database is a single physical instance, so using built-in replication means replicating between one database and many. And because the table-sharding schemes differ, a mapping configuration must be maintained on both sides at the database level. The logic becomes quite complicated and the cost is high; in effect, the important database/table sharding logic would have to be re-implemented at the database layer.
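To see why this duplicates work, here is the kind of routing logic a sharded application layer typically already contains, and which the replication scheme would have to re-create in database-level configuration. The shard counts and the modulo-on-primary-key routing rule are assumptions for illustration, not details from the original system.

```java
// Illustrative sharding router: maps a record's unique key to one of
// dbCount physical databases and one of tablesPerDb tables within it.
class ShardRouter {
    private final int dbCount;
    private final int tablesPerDb;

    ShardRouter(int dbCount, int tablesPerDb) {
        this.dbCount = dbCount;
        this.tablesPerDb = tablesPerDb;
    }

    // Hypothetical routing rule: slot = id mod (total tables),
    // then split the slot into a database index and a table index.
    String route(long recordId) {
        int slot = (int) (recordId % (dbCount * tablesPerDb));
        int db = slot / tablesPerDb;
        int table = slot % tablesPerDb;
        return "db_" + db + ".record_" + table;
    }
}
```

Every rule like this would need an equivalent mapping in the replication configuration, and the two copies would have to be kept in sync by hand.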

This also creates a maintenance problem. Once the new system is considered stable enough, the application can switch its writes to the new database alone and retire the old-database path, all without a re-release. With the database-level replication scheme, however, the switchover requires stopping the replication scripts and changing configuration in multiple places.

3. Keep two tables with the same model in the old database: one is the old database's persistence table, and the other is a temporary (staging) table holding data waiting to be synchronized to the new database. A record is deleted from the staging table once it has been synchronized; records not yet synchronized remain there to be retried. A scheduled job runs every minute, extracts a batch of records from the staging table, and imports them into the new database in bulk; retrying guarantees eventual consistency. The new database side must be idempotent so that each record is applied only once. This is usually achieved with the record's unique feature identifier, typically its unique primary key.

The following is a simple implementation:
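The original snippet does not survive in this copy of the article. Below is a minimal in-memory sketch of the idea, with assumed names throughout: in the real system the queue would be the staging database table, the sink would be the sharded new database, and `runOnce` would be driven by a scheduler every minute.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Sketch of solution 3: a staging "table" drained by a scheduled job
// into an idempotent sink, with failed writes retried on later runs.
class SyncJob {
    static class Record {
        final long id;        // unique primary key, used for idempotency
        final String payload;
        Record(long id, String payload) { this.id = id; this.payload = payload; }
    }

    private final Queue<Record> staging = new ArrayDeque<>(); // temporary table
    private final Set<Long> applied = new HashSet<>();        // new DB's dedup check

    void stage(Record r) { staging.add(r); }

    // One scheduled run: take up to batchSize records, write each to the
    // new database, and delete from staging only on success. A failure
    // leaves the record queued for the next run (the retry mechanism).
    void runOnce(int batchSize, java.util.function.Predicate<Record> writeToNewDb) {
        for (int i = 0; i < batchSize && !staging.isEmpty(); i++) {
            Record r = staging.peek();
            if (applied.contains(r.id) || writeToNewDb.test(r)) {
                applied.add(r.id);   // idempotent: applied at most once
                staging.poll();      // "delete from the staging table"
            } else {
                break;               // stop; retry this batch next run
            }
        }
    }

    int pending() { return staging.size(); }
    boolean isApplied(long id) { return applied.contains(id); }
}
```

The `applied` set stands in for the new database's unique-primary-key check: even if a batch is retried after a partial failure, re-inserting an already-synchronized record is a no-op rather than a duplicate.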

The main idea of the third solution is a retry mechanism, which is only one of the patterns for distributed transactions; two-phase commit and compensation-based exception recovery are others.
