Research on Distributed Transactions: Two-Phase Commit, One-Phase Commit, Best Efforts 1PC Mode, and the Transaction Compensation Mechanism

Last Update:2018-12-05 Source: Internet

Author: User



Original article: http://blog.csdn.net/bluishglc/article/details/7612811. Please credit the source when reproducing.

1. XA

XA is a distributed transaction specification proposed by the X/Open organization. The XA specification mainly defines the interface between the (global) transaction manager and the (local) resource manager. The XA interface is a bidirectional system interface that forms a communication bridge between the transaction manager and one or more resource managers. XA introduces a transaction manager because, in a distributed system, two machines cannot in theory be guaranteed to reach the same state on their own (see papers such as Fischer's), so a single point must be introduced for coordination. The transaction manager controls global transactions, manages transaction lifecycles, and coordinates resources. The resource manager controls and manages the actual resources (such as databases or JMS queues). Figure 1 shows the relationship between the transaction manager, the resource managers, and the application:

Figure 1. Relationships between various participants in distributed transactions under XA specifications

2. JTA

As the Java transaction specification, JTA (Java Transaction API) also defines support for XA transactions. In fact, JTA is modeled on the XA architecture. In JTA, the transaction manager is abstracted as the javax.transaction.TransactionManager interface and is implemented by the underlying transaction service (JTS). Like many other Java specifications, JTA only defines interfaces; concrete implementations are provided by vendors (such as J2EE vendors). Current JTA implementations fall mainly into the following types:

1. Implementations provided by J2EE containers (such as JBoss).
2. Independent JTA implementations, such as JOTM and Atomikos. These can be used to provide distributed transaction guarantees in environments that do not use a J2EE application server, such as Tomcat, Jetty, and ordinary Java applications.

3. Two-Phase Commit

Every introduction to distributed transactions inevitably covers two-phase commit, because it is the key to implementing XA distributed transactions. (Specifically, two-phase commit mainly guarantees the atomicity of a distributed transaction: either all nodes commit or none do.) The two phases are the prepare phase and the commit phase.

Figure 2. Two-phase commit (from Java Transaction Design Strategies, published by InfoQ)

1. Prepare phase: The transaction coordinator (transaction manager) sends a prepare message to each participant (resource manager). Each participant either returns a failure immediately (for example, because a permission check failed) or executes the transaction locally and writes its local redo and undo logs without committing, reaching a state in which everything is ready except the final commit. (I have not yet found an authoritative description of exactly what each participant does in the prepare phase, but I am confident that participants complete almost all of the work of a commit during this phase; some sources call this a "tentative commit". Only the last step of the formal commit is left for the second phase.)

2. Commit phase: If the coordinator receives a failure message from any participant or times out while waiting, it sends a rollback message to every participant; otherwise, it sends a commit message. Each participant commits or rolls back as instructed by the coordinator and then releases all lock resources held during transaction processing. (Note: lock resources must not be released until this final phase.)
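The two phases above can be sketched in a few lines of Java. This is only an in-memory illustration under stated assumptions: the names Participant, InMemoryParticipant, and TwoPhaseCoordinator are hypothetical and are not part of XA or JTA, and there is no logging, timeout handling, or crash recovery.

```java
import java.util.Arrays;
import java.util.List;

/** Coordinator (transaction manager) that drives both phases. */
class TwoPhaseCoordinator {
    /** Returns true if the global transaction committed, false if it rolled back. */
    static boolean run(List<? extends Participant> participants) {
        boolean allPrepared = true;
        // Phase 1: collect votes; stop at the first "no".
        for (Participant p : participants) {
            boolean vote;
            try {
                vote = p.prepare();
            } catch (RuntimeException e) {
                vote = false; // a failure (or, in a real system, a timeout) counts as "no"
            }
            if (!vote) {
                allPrepared = false;
                break;
            }
        }
        // Phase 2: send the same decision to every participant.
        for (Participant p : participants) {
            if (allPrepared) {
                p.commit();
            } else {
                p.rollback();
            }
        }
        return allPrepared;
    }

    public static void main(String[] args) {
        InMemoryParticipant db = new InMemoryParticipant("db", true);
        InMemoryParticipant queue = new InMemoryParticipant("queue", false);
        boolean committed = run(Arrays.asList(db, queue));
        System.out.println("committed=" + committed + ", db=" + db.state + ", queue=" + queue.state);
    }
}

/** Hypothetical participant (resource manager) interface for this sketch. */
interface Participant {
    boolean prepare();  // phase 1: do the work, write logs, keep locks, then vote
    void commit();      // phase 2: make the prepared work permanent, release locks
    void rollback();    // phase 2: undo the prepared work, release locks
}

/** In-memory participant that records its state so the outcome can be inspected. */
class InMemoryParticipant implements Participant {
    final String name;
    final boolean voteYes; // assumption: the vote is fixed up front for the demo
    String state = "INITIAL";

    InMemoryParticipant(String name, boolean voteYes) {
        this.name = name;
        this.voteYes = voteYes;
    }

    @Override
    public boolean prepare() {
        state = voteYes ? "PREPARED" : "FAILED";
        return voteYes;
    }

    @Override
    public void commit() {
        state = "COMMITTED";
    }

    @Override
    public void rollback() {
        state = "ROLLED_BACK";
    }
}
```

In the demo, the second participant votes "no", so the coordinator rolls back both participants and neither ends up committed. A real transaction manager would also persist its decision before phase 2 so that it can finish the protocol after a crash.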

The purpose of splitting the commit into two phases is to commit the transaction as late as possible, so that the transaction completes all the work it can before the commit. The final commit phase then becomes a very short, small operation that has a very low probability of failing in a distributed system. In other words, the so-called "critical period of network communication" is very short, and this is the key to how two-phase commit ensures the atomicity of distributed transactions. (In theory, the only case in which two-phase commit fails is when, after the coordinator issues the commit command, a host suffers a permanent error such as a disk failure, so that the transaction can no longer be tracked or recovered.)

The way two-phase commit works makes it clear that the commit process must be coordinated among multiple nodes, and that each node must hold its lock resources until the transaction is finally committed. As a result, two-phase commit takes longer than one-phase commit to execute the same transaction. The longer execution time increases the probability of lock conflicts. Once transaction concurrency reaches a certain level, a large backlog of transactions or even deadlocks may occur, and system performance degrades severely. This is the cost of using XA transactions.

4. One-Phase Commit (Best Efforts 1PC Mode)

Unlike two-phase commit, one-phase commit is very straightforward: the application sends a request to the database, the database commits or rolls back, and the result is returned to the application. One-phase commit needs no "coordinator" role and involves no coordination between nodes, so its transaction execution time is shorter than that of two-phase commit. However, its "critical period" is the actual commit time of each transaction, so compared with two-phase commit, one-phase commit is more likely to end up in an "inconsistent" state. Note, however, that inconsistency can occur only when the infrastructure fails (for example, a network interruption or a host failure). Given its performance advantage, many teams choose this solution. There is a very good article on implementing one-phase commit in a Spring environment that is worth reading: http://www.javaworld.com/javaworld/jw-01-2009/jw-01-spring-transactions.html?page=5
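The inconsistency window described above can be illustrated with a minimal sketch. It assumes two independent local resources that are committed one after another with no prepare step; LocalResource and BestEffortsOnePhase are hypothetical names, and the failOnCommit flag merely simulates an infrastructure fault. It is not how Spring implements the pattern.

```java
import java.util.Arrays;
import java.util.List;

/** Commits several local transactions in sequence, with no coordinator and no prepare step. */
class BestEffortsOnePhase {
    /** Returns true only if every commit succeeded. */
    static boolean commitAll(List<LocalResource> resources) {
        for (int i = 0; i < resources.size(); i++) {
            try {
                resources.get(i).commit();
            } catch (RuntimeException e) {
                // Resources after the failure can still be rolled back, but the ones
                // before it are already committed: this is the inconsistent state.
                for (int j = i + 1; j < resources.size(); j++) {
                    resources.get(j).rollback();
                }
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        LocalResource orders = new LocalResource("orders", false);
        LocalResource stock = new LocalResource("stock", true);
        boolean ok = commitAll(Arrays.asList(orders, stock));
        System.out.println("ok=" + ok + ", orders=" + orders.state + ", stock=" + stock.state);
    }
}

/** Hypothetical local resource with an ordinary one-phase commit. */
class LocalResource {
    final String name;
    final boolean failOnCommit; // assumption: simulates a network or host fault at commit time
    String state = "ACTIVE";

    LocalResource(String name, boolean failOnCommit) {
        this.name = name;
        this.failOnCommit = failOnCommit;
    }

    void commit() {
        if (failOnCommit) {
            state = "UNKNOWN";
            throw new IllegalStateException("commit failed on " + name);
        }
        state = "COMMITTED";
    }

    void rollback() {
        if ("ACTIVE".equals(state)) {
            state = "ROLLED_BACK";
        }
    }
}
```

In the demo, the first resource is already committed when the second one fails, so the two resources disagree. Ordering the commits so that the resource most likely to fail goes first narrows this window but cannot close it.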

5. Transaction Compensation Mechanism

A mode such as best efforts 1PC presupposes that the application can access all the data sources and manage their transactions with the same transaction manager (here, the Spring transaction manager). The most typical application scenario for this mode is database sharding. However, for autonomous distributed systems whose interfaces are built on Web Services, RPC, or JMS, the best efforts 1PC mode is powerless. In such scenarios, one last method can help us achieve "eventual consistency": the transaction compensation mechanism. Transaction compensation is a big topic. This article only mentions it briefly; a dedicated study and introduction will follow in the future.
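Since the article only touches on compensation, the following is a minimal sketch of one common reading of the idea, not a method taken from the source: each step carries an action that semantically undoes it, and when a later step fails, the completed steps are compensated in reverse order. CompensableStep and CompensatingExecutor are hypothetical names, and the order steps in the demo are invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Runs steps in order; on failure, compensates the completed steps in reverse order. */
class CompensatingExecutor {
    static boolean execute(List<CompensableStep> steps) {
        Deque<CompensableStep> completed = new ArrayDeque<>();
        for (CompensableStep step : steps) {
            try {
                step.action.run();
                completed.push(step); // most recently completed step sits on top
            } catch (RuntimeException e) {
                while (!completed.isEmpty()) {
                    completed.pop().compensation.run();
                }
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<CompensableStep> steps = Arrays.asList(
            new CompensableStep("stock", () -> log.add("reserve"), () -> log.add("release")),
            new CompensableStep("payment", () -> log.add("charge"), () -> log.add("refund")),
            new CompensableStep("shipping",
                () -> { throw new IllegalStateException("shipping service unavailable"); },
                () -> log.add("cancelShipment")));
        System.out.println("ok=" + execute(steps) + ", log=" + log);
    }
}

/** Hypothetical step: a forward action plus the action that semantically undoes it. */
class CompensableStep {
    final String name;
    final Runnable action;
    final Runnable compensation;

    CompensableStep(String name, Runnable action, Runnable compensation) {
        this.name = name;
        this.action = action;
        this.compensation = compensation;
    }
}
```

In the demo, the third step fails, so the log records reserve, charge, refund, release, in that order. A production implementation would also need to persist progress and retry compensations, because a compensation call can itself fail.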

6. How to choose between standard two-phase commit distributed transactions and best efforts 1PC?

Generally, if the number of subsystems that need to interact is small, the system will never or only rarely introduce new subsystems, and the load will remain stable for a long time (that is, there is no scaling requirement), then, considering development complexity and workload, you can choose distributed transactions. For systems whose schedule is not tight and whose performance requirements are high, consider best efforts 1PC or the transaction compensation mechanism. Distributed transactions should not be considered for systems that need to be sharded: sharding opens the window to horizontal database scaling, and using distributed transactions would in effect shut the window that has just been opened.

Supplement: the critical period of network communication

Because network faults can occur at any time, any program that sends a request and waits for a response may lose contact with the other side. The risk exists from the moment the request is sent until the server returns a response. If network communication fails during this period, the requesting party receives no response and therefore cannot determine whether the server has processed the request successfully. When no response arrives, it is even possible that the request never reached the server. This period is called the critical period of network communication (in-doubt time). Clearly, the critical period of network communication is another reliability issue, in addition to single-point reliability, that distributed systems must take into account.
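The ambiguity during the in-doubt time can be made concrete with a small simulation. Everything here is hypothetical: CounterServer, Fault, ClientView, and InDoubtDemo are illustrative names, and the "network" is just a parameter that decides which message is lost.

```java
/** Simulates one request over a network that may lose the request or the response. */
class InDoubtDemo {
    /** Returns what the client is able to know after one attempt. */
    static ClientView send(CounterServer server, Fault fault) {
        if (fault == Fault.REQUEST_LOST) {
            return ClientView.IN_DOUBT;     // the server never saw the request
        }
        String response = server.handle();  // the server processed the request
        if (fault == Fault.RESPONSE_LOST) {
            return ClientView.IN_DOUBT;     // processed, but the client never hears about it
        }
        return "OK".equals(response) ? ClientView.CONFIRMED : ClientView.IN_DOUBT;
    }

    public static void main(String[] args) {
        for (Fault fault : Fault.values()) {
            CounterServer server = new CounterServer();
            ClientView view = send(server, fault);
            System.out.println(fault + ": client sees " + view + ", server applied " + server.applied);
        }
    }
}

/** Hypothetical server that applies a request at the moment it receives it. */
class CounterServer {
    int applied = 0;

    String handle() {
        applied++;
        return "OK";
    }
}

/** Which message, if any, the simulated network drops. */
enum Fault { NONE, REQUEST_LOST, RESPONSE_LOST }

/** The only two things the client can conclude. */
enum ClientView { CONFIRMED, IN_DOUBT }
```

For both REQUEST_LOST and RESPONSE_LOST the client sees the same IN_DOUBT result, yet the server applied the request in only one of the two cases. This is why a blind retry is safe only when the operation is idempotent.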

