Distributed Transactions (I): Two-Phase Commit and JTA

Distributed transactions, like local transactions, have the ACID properties: atomicity, consistency, isolation, and durability. Two-phase commit is an important method for guaranteeing atomicity in distributed transactions. This article covers the principle of two-phase commit, the two-phase commit interface in PostgreSQL, and how JTA, the Java interface specification for two-phase commit, is used.

This is an original article. If you reproduce it, please place the following paragraph at the beginning of the article (keep the hyperlink).
This article is reproduced from Technical World. Original link: http://www.jasongj.com/big_data/two_phase_commit/

Distributed transactions

Introduction to distributed transactions

A distributed transaction is a transaction that operates on multiple databases (or other systems that provide transactional semantics, such as JMS). In effect, it extends the concept of a transaction on a single database to a transaction across multiple databases. Its purpose is to guarantee the atomicity of transactional operations in a distributed system. The key to distributed transaction processing is that there must be a way to know every action the transaction performs anywhere, and the decision to commit or roll back must produce a uniform result (all commit or all roll back).

Distributed transaction implementation mechanism

As described in the author's earlier article "Essence of SQL Optimization (VI): MVCC, PostgreSQL's implementation of transactions and multi-version concurrency control", transactions have four properties: atomicity, consistency, isolation, and durability.

The techniques PostgreSQL uses to implement ACID are shown in the following table.

 ACID property | Implementation technology
---------------+----------------------------------------------
 Atomicity     | MVCC
 Consistency   | Constraints (primary key, foreign key, etc.)
 Isolation     | MVCC
 Durability    | WAL

The techniques used to implement distributed transactions are shown in the following table, taking PostgreSQL as an example of a transaction participant.

 Distributed ACID property | Implementation technology
---------------------------+----------------------------------------------
 Atomicity                 | MVCC + two-phase commit
 Consistency               | Constraints (primary key, foreign key, etc.)
 Isolation                 | MVCC
 Durability                | WAL

As the table above shows, consistency, isolation, and durability rely on the existing mechanisms of each participant in the distributed transaction, while two-phase commit mainly guarantees the atomicity of the distributed transaction.

Two-phase commit

How distributed transactions guarantee atomicity

In a distributed system, the nodes (or transaction participants) are physically independent of one another and coordinate over the network. Each node can guarantee the ACID properties of its own data operations because it has a transaction mechanism. Across nodes, however, ACID (atomicity in particular) is hard to guarantee, because the nodes are independent and no node can know exactly how the transactions on the other nodes are proceeding.

To make a distributed system atomic, the data writes on all nodes must either all take effect or all not take effect. But while a node executes its local transaction, it cannot know the outcome of the local transactions on other machines, so it does not know whether the transaction should be committed or rolled back. The usual solution is to introduce a "coordinator" component that directs the execution of all the distributed nodes.

XA specification

XA is the distributed transaction specification proposed by the X/Open organization. It primarily defines the interface between the (global) transaction manager and the (local) resource managers. The XA interface is a bidirectional system interface that forms a communication bridge between the transaction manager and one or more resource managers. The transaction manager introduced by XA plays the "coordinator" role in the global transaction described above: it controls the global transaction, manages its lifecycle, and coordinates resources. A resource manager controls and manages an actual resource, such as a database or a JMS queue. Today, all major databases, including Oracle, Informix, DB2, Sybase, and PostgreSQL, provide XA support.
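In Java, the interface between a transaction manager and a resource manager is exposed as javax.transaction.xa.XAResource, which ships with the JDK. The following is a minimal sketch of a transaction manager driving resource managers through that interface. The classes SimpleXid and RecordingResource and the method runGlobalTransaction are hypothetical names invented for illustration; the resource only records the calls it receives and does no real work.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

class XaInterfaceSketch {
    /** Minimal Xid; the format id and the byte contents are arbitrary illustrative values. */
    static final class SimpleXid implements Xid {
        private final byte[] globalId;
        private final byte[] branchId;
        SimpleXid(String globalId, String branchId) {
            this.globalId = globalId.getBytes(StandardCharsets.UTF_8);
            this.branchId = branchId.getBytes(StandardCharsets.UTF_8);
        }
        @Override public int getFormatId() { return 1; }
        @Override public byte[] getGlobalTransactionId() { return globalId.clone(); }
        @Override public byte[] getBranchQualifier() { return branchId.clone(); }
    }

    /** Hypothetical in-memory resource manager that only records the calls it receives. */
    static final class RecordingResource implements XAResource {
        final List<String> calls = new ArrayList<>();
        private Xid prepared;  // the branch that is prepared but not yet completed, if any

        @Override public void start(Xid xid, int flags) { calls.add("start"); }
        @Override public void end(Xid xid, int flags) { calls.add("end"); }
        @Override public int prepare(Xid xid) { prepared = xid; calls.add("prepare"); return XA_OK; }
        @Override public void commit(Xid xid, boolean onePhase) { prepared = null; calls.add("commit"); }
        @Override public void rollback(Xid xid) { prepared = null; calls.add("rollback"); }
        @Override public Xid[] recover(int flag) { return prepared == null ? new Xid[0] : new Xid[] {prepared}; }
        @Override public void forget(Xid xid) { }
        @Override public boolean isSameRM(XAResource other) { return other == this; }
        @Override public int getTransactionTimeout() { return 0; }
        @Override public boolean setTransactionTimeout(int seconds) { return false; }
    }

    /** Plays the transaction manager; returns true if the decision was to commit. */
    static boolean runGlobalTransaction(List<? extends XAResource> resources) {
        List<Xid> xids = new ArrayList<>();
        boolean allPrepared = true;
        try {
            for (int i = 0; i < resources.size(); i++) {
                Xid xid = new SimpleXid("global-tx-1", "branch-" + i);
                xids.add(xid);
                resources.get(i).start(xid, XAResource.TMNOFLAGS);  // xa_start
                // The application's work against this resource would run here.
                resources.get(i).end(xid, XAResource.TMSUCCESS);    // xa_end
            }
            for (int i = 0; i < resources.size(); i++) {
                // Simplification: a real manager would skip phase 2 for XA_RDONLY votes.
                resources.get(i).prepare(xids.get(i));              // xa_prepare
            }
        } catch (XAException e) {
            allPrepared = false;  // any failure before the decision means rollback
        }
        for (int i = 0; i < xids.size(); i++) {
            try {
                if (allPrepared) resources.get(i).commit(xids.get(i), false);  // xa_commit
                else resources.get(i).rollback(xids.get(i));                   // xa_rollback
            } catch (XAException e) {
                // A real transaction manager would log this and retry during recovery.
            }
        }
        return allPrepared;
    }

    public static void main(String[] args) {
        RecordingResource a = new RecordingResource();
        RecordingResource b = new RecordingResource();
        System.out.println(runGlobalTransaction(List.of(a, b)) + " " + a.calls);
    }
}
```

Running main prints `true [start, end, prepare, commit]`. A production transaction manager also persists its decision and handles timeouts and heuristic outcomes, all of which this sketch leaves out.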

In the XA specification, the transaction manager manages the resource managers mainly through the following interfaces:

- xa_open, xa_close: open and close the connection to a resource manager.
- xa_start, xa_end: start and end a local transaction.
- xa_prepare, xa_commit, xa_rollback: pre-commit, commit, and roll back a local transaction.
- xa_recover: obtain the list of transactions that have been pre-committed (prepared) but not yet completed, so that they can be committed or rolled back.

Two-phase commit principle

The two-phase commit algorithm can be summarized as follows: the coordinator asks the participants whether they are ready to commit and, based on the replies from all of them, decides whether to send a commit or a rollback instruction to all participants (the coordinator sends the same instruction to every participant).

The two phases are:

- Preparation phase, also called the voting phase. The coordinator asks all participants whether they are ready to commit. A participant replies Prepared if it is, and Non-Prepared otherwise.
- Commit phase, also called the execution phase. If the coordinator received Prepared from every participant in the previous phase, it sends a commit instruction to all participants, and each participant commits immediately. Otherwise, the coordinator sends a rollback instruction to all participants, and each participant rolls back immediately.
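The two phases can be sketched in a few lines of Java. This is a simplified in-memory model, not an implementation of XA or JTA: the Participant interface, the InMemoryParticipant class, and the method names are assumptions made for illustration, and there is no networking, logging, or crash handling.

```java
import java.util.List;

class TwoPhaseCommitSketch {
    /** Hypothetical participant API; the names are illustrative, not taken from XA or JTA. */
    interface Participant {
        boolean prepare();  // phase 1: vote, true means "prepared"
        void commit();      // phase 2: make the prepared work permanent
        void rollback();    // phase 2: discard the work
    }

    /** In-memory participant that records its state so the outcome can be inspected. */
    static final class InMemoryParticipant implements Participant {
        private final boolean canPrepare;
        String state = "ACTIVE";

        InMemoryParticipant(boolean canPrepare) { this.canPrepare = canPrepare; }

        @Override public boolean prepare() {
            if (canPrepare) state = "PREPARED";
            return canPrepare;
        }
        @Override public void commit()   { state = "COMMITTED"; }
        @Override public void rollback() { state = "ROLLED_BACK"; }
    }

    /** Runs both phases; returns true if the global transaction committed. */
    static boolean runTwoPhaseCommit(List<? extends Participant> participants) {
        // Phase 1 (voting): every participant must answer "prepared".
        boolean allPrepared = true;
        for (Participant p : participants) {
            if (!p.prepare()) {
                allPrepared = false;
                break;  // one "no" vote is enough to decide on rollback
            }
        }
        // Phase 2 (execution): the same instruction goes to every participant.
        for (Participant p : participants) {
            if (allPrepared) p.commit(); else p.rollback();
        }
        return allPrepared;
    }

    public static void main(String[] args) {
        List<InMemoryParticipant> healthy =
                List.of(new InMemoryParticipant(true), new InMemoryParticipant(true));
        System.out.println(runTwoPhaseCommit(healthy) + " " + healthy.get(0).state);

        List<InMemoryParticipant> oneRefuses =
                List.of(new InMemoryParticipant(true), new InMemoryParticipant(false));
        System.out.println(runTwoPhaseCommit(oneRefuses) + " " + oneRefuses.get(0).state);
    }
}
```

The first call prints `true COMMITTED`. The second prints `false ROLLED_BACK`, because a single Non-Prepared vote forces every participant, including the one that had already prepared, to roll back.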

In a two-phase commit, the interaction between the coordinator and the participants is shown in the following illustration.

Prerequisites for two-phase commit

- Network communication is trustworthy. Although the network is unreliable, two-phase commit does not aim to solve problems such as the Byzantine Generals Problem. The window in which two-phase commit is sensitive to communication failures (the in-doubt time) falls within the commit phase, and that phase is very short.
- Every crashed node eventually recovers; no node stays crashed forever.
- Every distributed transaction participant has a WAL log, and the log is kept on stable storage.
- The local transaction state on each node can be recovered from the WAL log even if the machine crashes.

Two-phase commit fault tolerance

Failures during two-phase commit fall into three cases:

- The coordinator is normal and a participant crashes.
- The coordinator crashes and the participants are normal.
- Both the coordinator and a participant crash.

In the first case, if a participant crashes during the preparation phase, the coordinator does not receive a Prepared reply from it, so the coordinator does not send a commit and the transaction is not committed. If a participant crashes during the commit phase, then after it recovers it can obtain the state of the transaction from the other participants or from the coordinator and act accordingly.

The second case can be resolved by electing a new coordinator.

In the third case, two-phase commit is not a perfect solution. In particular, if the coordinator crashes right after sending a commit and the only participant that received the commit also crashes, the remaining participants cannot learn the outcome of the transaction from either the coordinator or the crashed participant. However, as stated in the prerequisites above, all crashed nodes eventually recover, so once the participant that received the commit recovers, the other nodes can obtain the transaction state from it and act accordingly.

JTA

Introduction to JTA

The Java platform's transaction specification, JTA (Java Transaction API), also defines support for XA transactions; in fact, JTA is modeled on the XA architecture. In JTA, the transaction manager is abstracted as the javax.transaction.TransactionManager interface and is implemented by the underlying transaction service (the Java Transaction Service, JTS). Like many other Java specifications, JTA only defines interfaces; the implementations are provided by vendors (such as Java EE vendors). The main JTA implementations currently are:

- JTA implementations provided by Java EE containers (such as JBoss).
- Standalone JTA implementations, such as JOTM (Java Open Transaction Manager) and Atomikos. These can be used in environments without a Java EE application server to provide distributed transaction guarantees.

PostgreSQL two-phase commit interface

- PREPARE TRANSACTION transaction_id: prepares the current transaction for two-phase commit. After this command, the transaction is no longer associated with the current session; its state is saved completely on disk, and it has a very high likelihood of committing successfully, even if the database crashes before the commit is requested. This command must be used inside a transaction block that was started explicitly with BEGIN.
- COMMIT PREPARED transaction_id: commits the prepared transaction whose ID is transaction_id.
- ROLLBACK PREPARED transaction_id: rolls back the prepared transaction whose ID is transaction_id.

Typical usage is as follows.

    postgres=> BEGIN;
    BEGIN
    postgres=> CREATE TABLE demo (a TEXT, b INTEGER);
    CREATE TABLE
    postgres=> PREPARE TRANSACTION 'the prepared transaction';
    PREPARE TRANSACTION
    postgres=> SELECT * FROM pg_prepared_xacts;
     transaction |           gid            |           prepared            | owner | database
    -------------+--------------------------+-------------------------------+-------+----------
           23970 | the prepared transaction | 2016-08-01 20:44:55.816267+08 | casp  | postgres
    (1 row)

As the session above shows, after PREPARE TRANSACTION transaction_id, PostgreSQL lists the transaction in the pg_catalog.pg_prepared_xacts view. The transaction_id is stored in the gid column, and the local transaction ID of the transaction (23970) is stored in the transaction column. The view also records when the transaction was prepared, the user who created it, and the database name.

Next, exit the session, reconnect, and continue with the following commands.

    postgres=> \q

After reconnecting in a new session:

    postgres=> SELECT * FROM pg_prepared_xacts;
     transaction |           gid            |           prepared            | owner | database
    -------------+--------------------------+-------------------------------+-------+----------
           23970 | the prepared transaction | 2016-08-01 20:44:55.816267+08 | casp  | postgres
    (1 row)

    postgres=> ROLLBACK PREPARED 'the prepared transaction';
    ROLLBACK PREPARED
    postgres=> SELECT * FROM pg_prepared_xacts;
     transaction | gid | prepared | owner | database
    -------------+-----+----------+-------+----------
    (0 rows)

Even after the session exits, the prepared transaction is still listed in pg_catalog.pg_prepared_xacts. This matches the statement above that, after the preparation phase, each node keeps the transaction information durably on disk.

Note: without PREPARE TRANSACTION 'transaction_id', a transaction that has been started with BEGIN but not yet committed or rolled back is rolled back automatically when the session exits.
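Because prepared transactions outlive the session that created them, a coordinator can finish them later from a new connection. The sketch below shows one way such a recovery pass might choose its statements. It assumes a hypothetical coordinator decision log, modeled here as a Map from GID to decision, and it assumes the GIDs were read beforehand with a query such as SELECT gid FROM pg_prepared_xacts. The class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class PreparedTxRecoverySketch {
    /** Quotes a GID as a SQL string literal by doubling embedded single quotes. */
    static String quote(String gid) {
        return "'" + gid.replace("'", "''") + "'";
    }

    /**
     * Builds the statements that finish in-doubt transactions on one node.
     * preparedGids: the gid values read from pg_prepared_xacts on that node.
     * decisions: hypothetical coordinator log, true = commit, false = roll back.
     * GIDs missing from the log are left alone, since their outcome is unknown.
     */
    static List<String> resolutionStatements(List<String> preparedGids,
                                             Map<String, Boolean> decisions) {
        List<String> sql = new ArrayList<>();
        for (String gid : preparedGids) {
            Boolean commit = decisions.get(gid);
            if (commit == null) {
                continue;  // no recorded decision: do not guess
            }
            sql.add((commit ? "COMMIT PREPARED " : "ROLLBACK PREPARED ") + quote(gid));
        }
        return sql;
    }

    public static void main(String[] args) {
        Map<String, Boolean> decisions = Map.of("order-42", true, "order-43", false);
        List<String> inDoubt = List.of("order-42", "order-43", "order-44");
        for (String statement : resolutionStatements(inDoubt, decisions)) {
            System.out.println(statement + ";");
        }
    }
}
```

With the sample data, the sketch emits a COMMIT PREPARED for order-42 and a ROLLBACK PREPARED for order-43, and leaves order-44 untouched because no decision was recorded for it.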

When rolling back a transaction that has entered the prepared state, its transaction_id must be specified.

Notes on PostgreSQL two-phase commit

- After PREPARE TRANSACTION transaction_id, the transaction state is saved completely on disk.
- After PREPARE TRANSACTION transaction_id, the transaction is no longer associated with the current session, so the session can go on to execute other transactions.
- COMMIT PREPARED and ROLLBACK PREPARED can be executed from any session, not only the session that prepared the transaction.
- PREPARE TRANSACTION is not allowed for a transaction that has executed operations involving temporary tables or has created a cursor WITH HOLD. These features are tied so closely to the current session that they are of no use in a prepared transaction.
- If a transaction modifies run-time parameters with SET (without the LOCAL option), those effects persist after PREPARE TRANSACTION and are not affected by a later COMMIT PREPARED or ROLLBACK PREPARED, because SET takes effect at the session level.
- From a performance standpoint, leaving a transaction in the prepared state for a long time is unwise, because it interferes with VACUUM's ability to reclaim storage.
- A prepared transaction keeps holding the locks it acquired until it is committed or rolled back. If a prepared transaction is left unresolved, other transactions may block or fail because they cannot acquire those locks.
- By default, PostgreSQL does not enable two-phase commit. To enable it, set max_prepared_transactions to a nonzero value in postgresql.conf; the change takes effect after a server restart.

Implementing PostgreSQL two-phase commit with JTA

This article uses the JTA implementation provided by Atomikos, together with the two-phase commit feature provided by PostgreSQL, to implement a distributed transaction. The distributed transaction spans PostgreSQL instances on two different machines.

The code shown in this example can be obtained from the author's GitHub.
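As a rough illustration of what a transaction manager does on the application's behalf, here is a minimal sketch that drives PREPARE TRANSACTION and COMMIT PREPARED on several nodes by hand. It is not the author's Atomikos code and does not use JTA. The SqlRunner interface and all method names are hypothetical; the jdbc adapter assumes a java.sql.Connection left in autocommit mode and a driver that passes BEGIN through unchanged. The main method uses fake nodes that only print, so the flow can be run without a database.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

class PgTwoPhaseSketch {
    /** Runs one SQL statement on one node; an abstraction so the flow can run without a database. */
    interface SqlRunner {
        void run(String sql) throws SQLException;
    }

    /** Adapts a JDBC connection (assumed to be in autocommit mode) to a SqlRunner. */
    static SqlRunner jdbc(Connection connection) {
        return sql -> {
            try (Statement st = connection.createStatement()) {
                st.execute(sql);
            }
        };
    }

    /** Quotes a GID as a SQL string literal by doubling embedded single quotes. */
    static String quote(String gid) {
        return "'" + gid.replace("'", "''") + "'";
    }

    /** work.get(i) runs on nodes.get(i); returns true if the decision was to commit. */
    static boolean commitOnAllNodes(List<SqlRunner> nodes, List<String> work, String gid) {
        int preparedCount = 0;
        boolean allPrepared = true;
        // Phase 1: do the work on each node and prepare it under the shared GID.
        for (int i = 0; i < nodes.size() && allPrepared; i++) {
            SqlRunner node = nodes.get(i);
            try {
                node.run("BEGIN");
                node.run(work.get(i));
                node.run("PREPARE TRANSACTION " + quote(gid));
                preparedCount++;
            } catch (SQLException e) {
                allPrepared = false;
                try { node.run("ROLLBACK"); } catch (SQLException ignored) { }
            }
        }
        // Phase 2: finish every prepared branch with the same decision.
        String decision = allPrepared ? "COMMIT PREPARED " : "ROLLBACK PREPARED ";
        for (int i = 0; i < preparedCount; i++) {
            try {
                nodes.get(i).run(decision + quote(gid));
            } catch (SQLException e) {
                // A real coordinator would record the decision durably and retry later.
            }
        }
        return allPrepared;
    }

    public static void main(String[] args) {
        // Fake nodes that print instead of talking to PostgreSQL.
        SqlRunner node1 = sql -> System.out.println("node1: " + sql);
        SqlRunner node2 = sql -> System.out.println("node2: " + sql);
        boolean committed = commitOnAllNodes(
                List.of(node1, node2),
                List.of("INSERT INTO demo VALUES ('a', 1)", "INSERT INTO demo VALUES ('b', 2)"),
                "demo-gid-1");
        System.out.println("committed: " + committed);
    }
}
```

To try it against real servers, the fake nodes could be replaced with jdbc(connection) for a connection to each PostgreSQL instance, with max_prepared_transactions set to a nonzero value on both. A JTA implementation adds what this sketch omits: a durable decision log, recovery of in-doubt transactions, and timeouts.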
