Concurrent Transaction Processing



Transaction protection is essential in the software industry. Many financial companies have run into trouble precisely because of how they handled transactions.

As we all know, transactions have four major properties, known as ACID: atomicity, consistency, isolation, and durability.


Four properties

Atomicity

A transaction is the logical unit of work in a database: either all of the operations in the transaction are executed, or none of them are.

Consistency

The result of a transaction must move the database from one consistent state to another consistent state. Consistency is closely related to atomicity.

Isolation

The execution of a transaction must not be interfered with by other concurrent transactions.

Durability

Once a transaction is committed, it should change the data in the database permanently.

These are the four properties of transactions.
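To make atomicity concrete, here is a minimal sketch using Python's built-in sqlite3 module. The account table, the transfer function, and the amounts are invented for illustration and are not from any particular system; the point is only that a failed transfer leaves both balances exactly as they were.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst: both updates apply, or neither does."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE name = ?", (src,)).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # undo the partial debit as well
        raise

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

# isolation_level=None: the module opens no implicit transactions,
# so the BEGIN/COMMIT/ROLLBACK statements above are the only ones issued.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])

transfer(conn, "a", "b", 30)       # succeeds
try:
    transfer(conn, "a", "b", 500)  # fails halfway and is rolled back
except ValueError:
    pass
print(balances(conn))
```

The second transfer has already debited account a when the check fails, so without the rollback the money would simply vanish.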


Isolation level

Next, let's talk about isolation.

If transaction control is too strict, program performance drops under concurrent access. People are therefore willing to trade some isolation for performance, and isolation is divided into four levels:

Read uncommitted, read committed, repeatable read, and serializable.

Because these levels restrict concurrent transactions to different degrees, dirty reads, non-repeatable reads, and phantom reads may occur.

1. Dirty read: a transaction modifies some data but has not yet committed the change, and another transaction reads that data and uses it.

2. Non-repeatable read: a transaction reads the same data more than once. Before it finishes, another transaction modifies that data and commits, so the two reads in the first transaction return different values. Because the same read gives different results within one transaction, it is called a non-repeatable read. For example, an editor reads the same document twice, but the author overwrites the document between the two reads. The document has changed by the time the editor reads it the second time, so the original read cannot be repeated. This problem can be avoided if the editor can read the document only after the author has finished writing.

3. Phantom read: a phenomenon that occurs when transactions are not executed in isolation. For example, the first transaction modifies every row in a table. At the same time, a second transaction inserts a new row into the same table. Afterwards, the user running the first transaction finds that the table still contains a row that was not modified, as if it were an illusion. For example, an editor changes the document submitted by the author, but when the production department merges the changes into the primary copy of the document, the author has already added new, unedited material to it. This problem can be avoided if no one can add new material to the document until the editor and the production department have finished processing the original.


The isolation levels of database transactions, from low to high, are read uncommitted, read committed, repeatable read, and serializable.

Read uncommitted prevents none of the three anomalies. Each level above it rules out one more, in order: dirty reads, then non-repeatable reads, then phantom reads.

 

√: may occur   ×: cannot occur

                    Dirty read    Non-repeatable read    Phantom read
Read uncommitted    √             √                      √
Read committed      ×             √                      √
Repeatable read     ×             ×                      √
Serializable        ×             ×                      ×

 

Note: isolation levels only matter when multiple transactions run concurrently, so the following sections walk through each level using concurrent-transaction scenarios.

Read uncommitted

On payday, the leader transferred 5000 yuan to singo's account, but the transaction had not yet been committed. Singo happened to check his account, saw that a salary of 5000 yuan had arrived, and was very happy. Unfortunately, the leader then noticed that the amount was wrong: it should have been 2000 yuan. He quickly rolled back the transaction, corrected the amount, and committed. Singo's actual salary was only 2000 yuan, so his joy was short-lived.

 

This situation is a dirty read. There are two concurrent transactions, "transaction A: the leader pays singo's salary" and "transaction B: singo queries his salary account", and transaction B reads data that transaction A has not committed.

When the isolation level is set to read uncommitted, dirty reads may occur. To avoid dirty reads, see the next isolation level.
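The salary story can be replayed with a toy model. This is a sketch only: ToyStore is a made-up class holding one committed value and one pending write, not how any real database is implemented. It shows what a reader sees at each of the two lowest isolation levels.

```python
class ToyStore:
    """One committed value plus one pending (uncommitted) write."""

    def __init__(self, value):
        self.committed = value
        self.pending = None  # written by an open transaction, not yet committed

    def write(self, value):
        self.pending = value

    def commit(self):
        if self.pending is not None:
            self.committed, self.pending = self.pending, None

    def rollback(self):
        self.pending = None

    def read(self, level):
        # Only "read uncommitted" is allowed to see another transaction's pending write.
        if level == "read uncommitted" and self.pending is not None:
            return self.pending
        return self.committed


salary = ToyStore(0)
salary.write(5000)                                    # A: leader pays 5000, no commit yet
seen_by_singo = salary.read("read uncommitted")       # B: dirty read
seen_if_read_committed = salary.read("read committed")
salary.rollback()                                     # A: wrong amount, roll back
salary.write(2000)
salary.commit()                                       # A: correct amount, commit
final_salary = salary.read("read committed")
print(seen_by_singo, seen_if_read_committed, final_salary)
```

Under read uncommitted, singo sees the 5000 yuan that never really existed; under read committed he would have seen the old balance until the leader committed.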


Read committed

Singo pays with his payroll card. The system reads the card and sees a balance of 2000 yuan. At that moment his wife transfers the 2000 yuan from the payroll card to another account, and her transaction commits before singo's. When the system goes to deduct the payment, it finds that the card has no money, and the deduction fails. Singo wonders why, since the card clearly had money in it ...

This situation is a non-repeatable read. There are two concurrent transactions, "transaction A: singo's purchase" and "transaction B: singo's wife's online transfer". Transaction A reads the data first, transaction B then updates the data and commits, and when transaction A reads the data again it has changed.

When the isolation level is set to read committed, dirty reads are avoided, but non-repeatable reads may occur.

Read committed is the default level of many databases, such as SQL Server and Oracle. To avoid non-repeatable reads, see the next isolation level.
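The payroll card scenario can be sketched the same way. PayrollCard and purchase are hypothetical names, and the "concurrent" transfer is simply a callback that runs between the two reads; this is a simulation of the interleaving, not real concurrency.

```python
class PayrollCard:
    """Holds only committed data, which is all a read-committed reader can see."""

    def __init__(self, balance):
        self.balance = balance


def purchase(card, price, concurrent_commit=None):
    """Transaction A: read the balance, then read it again before deducting."""
    first_read = card.balance
    if concurrent_commit is not None:
        concurrent_commit(card)        # transaction B commits in between
    second_read = card.balance         # the same read now returns a new value
    paid = second_read >= price
    if paid:
        card.balance = second_read - price
    return first_read, second_read, paid


def wife_transfers_everything(card):
    """Transaction B: move the whole balance elsewhere and commit."""
    card.balance = 0


card = PayrollCard(2000)
result = purchase(card, 2000, concurrent_commit=wife_transfers_everything)
print(result)
```

The two reads inside the same purchase disagree, which is exactly the non-repeatable read described above.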


Repeatable read

When the isolation level is set to repeatable read, non-repeatable reads are avoided. When singo pays with his payroll card, once the system starts reading the card information (that is, once the transaction starts), singo's wife cannot modify the record, so she cannot transfer the money out at that moment.

Although repeatable read avoids non-repeatable reads, phantom reads may still occur.

Singo's wife works at a bank and often checks singo's credit card purchase records through the internal banking system. One day she queried singo's total credit card spending for the month (select sum(amount) from transaction where month = this month) and got 80 yuan. Just then singo paid the bill at a cashier after eating out, spending 1000 yuan, which added a 1000 yuan purchase record (insert transaction ...) and committed. Singo's wife then printed the details of singo's credit card spending for the month on A4 paper, and found that the total was 1080 yuan. She was surprised and thought she was seeing things. This is how a phantom read arises.

Note: the default isolation level of MySQL is repeatable read.
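Why a phantom can slip through is easier to see in a toy model. The sketch below assumes a purely lock-based repeatable read in which a reader locks the rows it has read; ToyTable is invented for illustration, and real engines differ (many use multi-version snapshots instead). Existing rows are protected from updates, but nothing stops a new row from being inserted.

```python
class ToyTable:
    """Rows read by the open reader are locked against updates, not against inserts."""

    def __init__(self, rows):
        self.rows = dict(rows)        # row id -> amount
        self.read_locked = set()      # ids locked by the reading transaction

    def sum_amounts(self):
        self.read_locked |= set(self.rows)   # lock every row that was read
        return sum(self.rows.values())

    def update(self, row_id, amount):
        if row_id in self.read_locked:
            raise RuntimeError("row is locked by a reader")
        self.rows[row_id] = amount

    def insert(self, row_id, amount):
        self.rows[row_id] = amount    # a brand-new row carries no lock


purchases = ToyTable({1: 80})
first_total = purchases.sum_amounts()     # the wife's query: 80 yuan

try:
    purchases.update(1, 0)                # changing a row she read is blocked
    update_blocked = False
except RuntimeError:
    update_blocked = True

purchases.insert(2, 1000)                 # singo's new purchase is not blocked
second_total = purchases.sum_amounts()    # the same query now includes a phantom row
print(first_total, update_blocked, second_total)
```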


Serializable

Serializable is the highest transaction isolation level. It has the highest cost and the lowest performance, and is rarely used. At this level, transactions avoid not only dirty reads and non-repeatable reads but also phantom reads.


PS: many databases, such as Oracle and SQL Server, use read committed as the default isolation level. It performs well under heavy concurrent access and prevents dirty reads; non-repeatable reads can still happen, but they are usually within a tolerable range. Some databases, such as MySQL, use repeatable read as the default isolation level, so with the default configuration MySQL's performance is slightly lower. In a lock-based implementation of repeatable read, read and write locks are taken when data is accessed: concurrent reads can share a lock, but only the transaction that locked the data first may modify it, and other transactions cannot, which avoids non-repeatable reads. (In practice, MySQL's InnoDB engine serves ordinary reads from multi-version snapshots rather than read locks.) Serializable is also implemented with locks, but stricter ones: whichever thread reads the data holds it exclusively until its operation completes. This gives strong consistency but poor concurrency, and is rarely used.


Propagation behavior of transactions

During development, a single action may call several services. How do we keep transactions correct in this case? This is what transaction propagation is for. Let's look at the propagation behaviors of Spring transactions:

1. PROPAGATION_REQUIRED: supports the current transaction. If no transaction exists, a new one is created. This is the most common choice.
2. PROPAGATION_SUPPORTS: supports the current transaction. If no transaction exists, the method runs non-transactionally.
3. PROPAGATION_MANDATORY: supports the current transaction. If no transaction exists, an exception is thrown.
4. PROPAGATION_REQUIRES_NEW: always creates a new transaction. If a transaction already exists, it is suspended.
5. PROPAGATION_NOT_SUPPORTED: runs non-transactionally. If a transaction exists, it is suspended.
6. PROPAGATION_NEVER: runs non-transactionally. If a transaction exists, an exception is thrown.
7. PROPAGATION_NESTED: if a transaction exists, runs in a nested transaction marked by a savepoint. The nested work is committed together with the current transaction, but can be rolled back to the savepoint on its own. If no transaction exists, it behaves like PROPAGATION_REQUIRED.


In more detail:
1. PROPAGATION_REQUIRED: joins the current transaction if there is one; if the code is not already running in a transaction, a new transaction is started. For example, suppose the transaction level of ServiceB.methodB is PROPAGATION_REQUIRED. If ServiceA.methodA has already started a transaction and calls ServiceB.methodB, then ServiceB.methodB sees that it is already running inside ServiceA.methodA's transaction and does not start a new one. If ServiceA.methodA finds when it runs that it is not in a transaction, it starts one for itself. As a result, the transaction is rolled back if an exception occurs in ServiceA.methodA or anywhere in ServiceB.methodB. Even if ServiceB.methodB has already finished successfully, if ServiceA.methodA fails afterwards and rolls back, the work done by ServiceB.methodB is rolled back too.
2. PROPAGATION_SUPPORTS: if the code is currently running in a transaction, it runs as part of that transaction; if it is not, it runs non-transactionally.
3. PROPAGATION_MANDATORY: must run inside a transaction. That is, it can only be called from a caller that already has a transaction; otherwise an exception is thrown.
4. PROPAGATION_REQUIRES_NEW: this one is a little more involved. Suppose the transaction level of ServiceA.methodA is PROPAGATION_REQUIRED and that of ServiceB.methodB is PROPAGATION_REQUIRES_NEW. When execution reaches ServiceB.methodB, the transaction of ServiceA.methodA is suspended and ServiceB.methodB starts a new transaction; once ServiceB.methodB's transaction has completed, ServiceA.methodA's transaction resumes. The difference from PROPAGATION_REQUIRED is the extent of rollback. Because ServiceB.methodB runs in a new transaction, there are two separate transactions. If ServiceB.methodB has committed and ServiceA.methodA then fails and rolls back, ServiceB.methodB is not rolled back. If ServiceB.methodB fails and rolls back, and the exception it throws is caught by ServiceA.methodA, the ServiceA.methodA transaction may still commit.
5. PROPAGATION_NOT_SUPPORTED: does not support running in a transaction. For example, if the transaction level of ServiceA.methodA is PROPAGATION_REQUIRED and that of ServiceB.methodB is PROPAGATION_NOT_SUPPORTED, then when execution reaches ServiceB.methodB, the transaction of ServiceA.methodA is suspended, ServiceB.methodB runs to completion non-transactionally, and then ServiceA.methodA's transaction resumes.
6. PROPAGATION_NEVER: cannot run in a transaction. If the transaction level of ServiceA.methodA is PROPAGATION_REQUIRED and that of ServiceB.methodB is PROPAGATION_NEVER, then ServiceB.methodB throws an exception when called from ServiceA.methodA.
7. PROPAGATION_NESTED: the key to understanding nested transactions is the savepoint. The difference from PROPAGATION_REQUIRES_NEW is that PROPAGATION_REQUIRES_NEW starts another transaction that is independent of its parent, while a nested transaction depends on its parent: it is committed together with the parent transaction, so if the parent transaction is eventually rolled back, the nested transaction is rolled back as well. The advantage of a nested transaction is its savepoint. If ServiceB.methodB fails, the work rolls back only to the savepoint, and ServiceA.methodA can then choose another branch, such as ServiceC.methodC, and continue executing to try to complete its own transaction. However, this kind of transaction is not defined in the EJB standard.


PS: the propagation behavior we use most often is PROPAGATION_REQUIRED: it supports the current transaction, and creates a new one if none exists.


Distributed transactions

1. XA

XA is a distributed transaction specification proposed by the X/Open organization. The XA specification mainly defines the interface between a (global) transaction manager and a (local) resource manager. The XA interface is a bidirectional system interface that forms a communication bridge between the transaction manager and one or more resource managers. XA introduces a transaction manager because, in a distributed system, two machines cannot in theory be guaranteed to reach the same state on their own, so a single point of coordination is needed. The transaction manager controls global transactions, manages transaction lifecycles, and coordinates resources. The resource manager is responsible for controlling and managing the actual resources (such as databases or JMS queues).

2. JTA

JTA (Java Transaction API) defines support for XA transactions on the Java platform. In fact, JTA is modeled on the XA architecture. In JTA, the transaction manager is abstracted as the javax.transaction.TransactionManager interface and is implemented by the underlying transaction service (JTS). Like many other Java specifications, JTA only defines interfaces; the concrete implementation is provided by vendors (such as J2EE vendors). Current JTA implementations mainly fall into the following types:

1. JTA implementations provided by J2EE containers (such as JBoss).
2. Standalone JTA implementations, such as JOTM and Atomikos. These can be used to provide distributed transaction guarantees in environments that do not use a J2EE application server, such as Tomcat, Jetty, and ordinary Java applications.


3. Two-phase commit

Almost every introduction to distributed transactions covers two-phase commit, because it is the key to implementing XA distributed transactions (specifically, two-phase commit mainly guarantees the atomicity of a distributed transaction: either all nodes commit or none do). The two phases are the prepare phase and the commit phase.



1. Prepare phase: the transaction coordinator (transaction manager) sends a Prepare message to each participant (resource manager). Each participant either returns failure directly (for example, because a permission check failed) or executes the transaction locally, writing its local redo and undo logs but not committing, and reaches a state where everything is ready and only the final commit is missing. (I have not yet found specific material on exactly what each participant does in the prepare phase, but I am quite sure that participants complete almost everything needed for a formal commit during this phase. Some material calls this a "tentative commit", with only the final formal commit left for the second phase.)

2. Commit phase: if the coordinator receives a failure message from any participant or times out, it sends a Rollback message to every participant; otherwise it sends a Commit message. Each participant commits or rolls back as instructed by the coordinator and then releases all lock resources used during the transaction. (Note: lock resources must only be released in this final phase.)
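The two phases above can be sketched as a small in-memory simulation. Participant and two_phase_commit are hypothetical names; a real resource manager would do the work, write redo and undo logs, and hold locks during prepare, and a real coordinator would also handle timeouts and crashes, none of which is modelled here.

```python
class Participant:
    """Stands in for a resource manager; only its vote and final state are modelled."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.state = "idle"

    def prepare(self):
        # Phase 1: do the work without committing, then vote yes or no.
        self.state = "prepared" if self.can_prepare else "failed"
        return self.can_prepare

    def commit(self):
        self.state = "committed"      # phase 2: final commit, locks released

    def rollback(self):
        self.state = "rolled back"    # phase 2: undo, locks released


def two_phase_commit(participants):
    """Coordinator: commit everywhere only if every participant voted yes."""
    votes = [p.prepare() for p in participants]   # phase 1: ask everyone
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


healthy = [Participant("orders-db"), Participant("billing-db")]
mixed = [Participant("orders-db"), Participant("billing-db", can_prepare=False)]

healthy_result = two_phase_commit(healthy)
mixed_result = two_phase_commit(mixed)
print(healthy_result, [p.state for p in healthy])
print(mixed_result, [p.state for p in mixed])
```

A single "no" vote in the prepare phase is enough to roll back every participant, including those that prepared successfully.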

Summary:
The purpose of splitting the commit into two phases is to commit as late as possible, so that the transaction completes all the work it can before the commit. The final commit phase then becomes a very short, small operation whose probability of failure in a distributed system is very low. In other words, the so-called "dangerous period of network communication" is very short, and this is the key to how two-phase commit ensures the atomicity of distributed transactions. (In theory, the main case where two-phase commit goes wrong is when the coordinator has issued the commit command and a host then suffers a permanent error such as a disk failure, so that the transaction cannot be completed or recovered.)


Looking at how two-phase commit works, it is clear that committing a transaction requires coordination among multiple nodes, and that each node can release its lock resources only after the transaction has finally committed. As a result, two-phase commit takes longer than one-phase commit to execute the same transaction. The longer execution time increases the probability of lock conflicts, and once transaction concurrency reaches a certain level, large numbers of transactions may back up or even deadlock, and system performance degrades seriously. This is the cost of using XA transactions.

4. One-phase commit (Best Efforts 1PC pattern)

Unlike two-phase commit, one-phase commit is very straightforward: the application sends a request to the database, the database commits or rolls back, and the result is returned to the application. One-phase commit needs no "coordinator" role and no coordination between nodes, so transactions execute faster than with two-phase commit. However, the "dangerous period" of the commit is the actual commit time of every transaction, so compared with two-phase commit, the probability of ending up in an "inconsistent" state is higher. That said, note that "inconsistency" can only occur when the infrastructure fails (for example, a network interruption or a host failure). Weighed against its performance advantage, many teams choose this solution. There is a very good article on implementing one-phase commit in a Spring environment that is worth reading: http://www.javaworld.com/javaworld/jw-01-2009/jw-01-spring-transactions.html?page=5
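Where the "dangerous period" lies can be shown with two independent databases. This is a rough sketch, assuming the usual Best Efforts 1PC ordering (do all the work first, then commit each resource in turn); the orders table and the fail_between_commits switch are invented, and the simulated failure is just a rollback standing in for a crash or network loss.

```python
import sqlite3

def open_db():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE orders (id INTEGER)")
    return conn

def count(conn):
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

def best_efforts_1pc(first, second, fail_between_commits=False):
    """Do all the work on both resources, then commit them one after the other."""
    first.execute("BEGIN")
    second.execute("BEGIN")
    try:
        first.execute("INSERT INTO orders VALUES (1)")
        second.execute("INSERT INTO orders VALUES (1)")
    except Exception:
        first.execute("ROLLBACK")     # failures before any commit are safe
        second.execute("ROLLBACK")
        raise
    first.execute("COMMIT")
    # The dangerous period: the first resource is committed, the second is not.
    if fail_between_commits:
        second.execute("ROLLBACK")    # stands in for a crash or lost connection
        raise ConnectionError("lost the second resource before its commit")
    second.execute("COMMIT")

a, b = open_db(), open_db()
best_efforts_1pc(a, b)
normal_counts = (count(a), count(b))

c, d = open_db(), open_db()
try:
    best_efforts_1pc(c, d, fail_between_commits=True)
except ConnectionError:
    pass
failed_counts = (count(c), count(d))
print(normal_counts, failed_counts)
```

Any failure before the first commit rolls back cleanly; only a failure in the short gap between the two commits leaves the resources inconsistent.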

5. Transaction compensation mechanism

The Best Efforts 1PC pattern assumes that the application can access all of the data sources and manage their transactions with the same transaction manager (here, the Spring transaction manager). The most typical application scenario for this pattern is database sharding. However, for autonomous distributed systems whose interfaces are built on web services, RPC, or JMS, the Best Efforts 1PC pattern is powerless. In such scenarios, one last approach can still help us achieve "eventual consistency": the transaction compensation mechanism. Transaction compensation is a big topic; this article only mentions it briefly, and a dedicated study and introduction will follow in the future.
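Since the article only touches on compensation, the following is just one common way to read the idea, not a definitive design: each step is paired with an action that undoes it, and when a later step fails, the steps already completed are compensated in reverse order. All names here (run_with_compensation, the stock dictionary, the step functions) are hypothetical.

```python
def run_with_compensation(steps):
    """Run (action, compensation) pairs; on failure, undo completed steps in reverse."""
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


stock = {"widget": 5}

def reserve_stock():
    stock["widget"] -= 1          # step 1: a call to the inventory service

def release_stock():
    stock["widget"] += 1          # compensation for step 1

def charge_card():
    raise RuntimeError("payment service rejected the charge")   # step 2 fails

def refund_card():
    pass                          # compensation for step 2, never reached here


succeeded = run_with_compensation([
    (reserve_stock, release_stock),
    (charge_card, refund_card),
])
print(succeeded, stock)
```

Between the reservation and its compensation the stock level is briefly wrong, which is why this approach offers eventual rather than immediate consistency.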

6. How to choose between standard two-phase-commit distributed transactions and Best Efforts 1PC?

Generally, if the number of subsystems that need to interact is small, the system will rarely or never introduce new subsystems in the future, and the load will stay stable for a long time (that is, there is no scaling requirement), then, considering development complexity and workload, you can choose distributed transactions. For systems with loose timing requirements and high performance requirements, consider Best Efforts 1PC or a transaction compensation mechanism. For systems that need to be sharded, distributed transactions should not be considered: sharding opens the door to horizontal database scaling, and using distributed transactions would be like putting a new shackle on the door that was just opened.

To sum up

Transaction control is a necessary part of programming, yet we often pay little attention to it when writing programs. This is because there are generally two ways to manage transactions: programmatic transactions and declarative transactions. Programmatic transactions are flexible, but you have to write JDBC-template-style code by hand to control them, so we do not use them often. We usually use declarative transactions, which are entirely configuration-based and woven into the program through AOP. They do not show up in the code, so we do not see any transaction code.
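The contrast between the two styles can be imitated outside Spring. The sketch below is only an analogy: a Python decorator named transactional (a made-up helper, not Spring's annotation) wraps a function with begin, commit, and rollback, much as an AOP proxy would, so the business functions themselves contain no transaction code.

```python
import functools
import sqlite3

def transactional(func):
    """Wrap func(conn, ...) in BEGIN / COMMIT, rolling back if it raises."""
    @functools.wraps(func)
    def wrapper(conn, *args, **kwargs):
        conn.execute("BEGIN")
        try:
            result = func(conn, *args, **kwargs)
        except Exception:
            conn.execute("ROLLBACK")
            raise
        conn.execute("COMMIT")
        return result
    return wrapper

@transactional
def add_user(conn, name):
    conn.execute("INSERT INTO users VALUES (?)", (name,))

@transactional
def add_user_then_fail(conn, name):
    conn.execute("INSERT INTO users VALUES (?)", (name,))
    raise RuntimeError("something went wrong after the insert")

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (name TEXT)")

add_user(conn, "singo")
try:
    add_user_then_fail(conn, "ghost")   # the insert is rolled back by the wrapper
except RuntimeError:
    pass

names = [row[0] for row in conn.execute("SELECT name FROM users")]
print(names)
```

Note that this simple wrapper always begins a new transaction, so calling one decorated function from another would fail; handling that case is exactly what the propagation behaviors described earlier are for.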
