A full understanding of MySql transactions and MySql transactions
I have been working on Order projects recently and used transactions. MySql is used for our database, and innoDB is used for the storage engine. innoDB has good support for transactions. In this article, we will take a look at transaction-related knowledge.
Why is there a transaction?
Transactions are widely used in order systems, banking systems, and other scenarios. In the following scenario, user A and user B are bank depositors. Now A wants to transfer 500 yuan to B. Do the following:
1. Check A's account balance> 500 RMB;
2. Account A deducts RMB 500;
3. An increase of RMB 500 for B's account;
After the normal process is completed, account A deducts 500 and Account B adds 500. What if the system goes down after A deducts money from account? A lost 500 in vain, and B did not receive the 500 that should have belonged to him. In the above case, there is A hidden precondition: A deducts money and B adds money, either succeeds at the same time or fails at the same time. This is the requirement of transactions.
What is a transaction?
Instead of defining a transaction, let's talk about the features of the transaction. As we all know, transactions must meet the four features of ACID.
1. A (atomicity) atomicity. The execution of a transaction is considered as an inseparable minimum unit. The operations in the transaction can be either all successfully executed or all failed rollback operations. You cannot perform only some of them.
2. C (consistency) consistency. The execution of a transaction should not undermine the integrity constraints of the database. If the system crashes after the execution of the 2nd operations in the preceding example, the total money of A and B will not change.
3. I (isolation) isolation. Generally, transactions do not affect each other. However, in practice, the degree of mutual transaction impact is affected by the isolation level. The article will be detailed later.
4. D (durability) persistence. After the transaction is committed, the committed transaction must be persisted to the disk. Even if the system crashes, the submitted data should not be lost.
Four isolation levels of transactions
As mentioned earlier, the isolation of transactions is affected by the isolation level. What is the transaction isolation level? The isolation level of transactions can be considered as the "selfish" level of transactions, which defines the visibility between transactions. Isolation levels are divided into the following types:
1. read uncommitted (uncommitted read ). At the isolation level of RU, transaction A's modifications to the data are visible to transaction B even if they are not committed. This problem is called dirty read. This is an isolation level with a low degree of isolation, which may cause many problems in actual use, so it is generally not used.
2. read committed (submit READ ). At the RC isolation level, there will be no dirty read problems. Transaction A makes changes to the data. After committing, it will be visible to transaction B. For example, transaction B reads data 1 when it is started, and then transaction A starts, changes the data to 2, and commits, B reads the data again and reads the latest data 2. At the RC isolation level, the problem of non-repeated reading may occur. This isolation level is the default isolation level for many databases.
3. repeatable read (repeatable read ). At the RR isolation level, there will be no non-repeated read issues. Transaction A's modifications to the data are invisible to transactions started before transaction A after the commit. For example, transaction B reads data 1 when it is started, and then transaction A starts, changes the data to 2, commits, and B reads the data again, but still can only read 1. Phantom reading may occur at the RR isolation level. Phantom read means that when a transaction reads a value in a certain range, another transaction inserts a new record in this range, then the previous transaction reads the value in this range again and reads the newly inserted data. The default isolation level of Mysql is RR. However, the innoDB Engine gap lock of mysql successfully solves the phantom read problem.
4. SERIALIZABLE (SERIALIZABLE ). Serializable is the highest isolation level. This isolation level requires serial execution of all things. At this isolation level, each row of data read is locked, leading to a large number of lock requisition problems, with the worst performance.
Here is an example to help you understand the four isolation levels. 1. Transaction A and transaction B are enabled successively, and data 1 is updated multiple times. When the four little people start transactions at different times, what values of data 1 may be seen?
Figure 1
The first villain may read any one between 1 and 20. Because at the isolation level of uncommitted reads, the modifications made by other transactions to the data are also visible to the current transaction. The second villain may read and 20, and he can only read data that has been committed by other transactions. The data read by the third villain depends on the start time of the transaction. What is read when the transaction is started, then what is read before the transaction is committed. The fourth villain can only read data between A end and B start, but cannot read data during the execution of transactions A and B. Because the fourth villain needs to lock the data to read, during the execution of transactions A and B, it will occupy the data write lock, causing the fourth villain to wait for the lock.
Figure 2 lists the problems faced by different isolation levels.
Figure 2
Obviously, the higher the isolation level, the larger the resource consumption (LOCK) it brings, and the lower its concurrency performance. To be accurate, there is no concurrency at the serializable isolation level.
Figure 3
Transactions in MySql
Transaction implementation is a database-based storage engine. Different storage engines have different levels of support for transactions. Mysql supports transaction storage engines including innoDB and NDB. InnoDB is the default storage engine of mysql. The default isolation level is RR, and it goes further under the RR isolation level.Concurrency Control(MVCC, Multiversion Concurrency Control) solves the problem of non-repeated read, and adds a gap lock (that is, Concurrency Control) to solve the phantom read problem. Therefore, the RR isolation level of innoDB actually achieves the serialization level effect, and retains the good concurrency performance.
The isolation of transactions is achieved through locks, while the atomicity, consistency, and persistence of transactions are achieved through transaction logs. When talking about transaction logs, We Have To Say redo and undo.
1. redo log
In the innoDB Storage engine, transaction logs are implemented by redo logs and innoDB Log Buffer. When a transaction is started, operations in the transaction are first written to the log buffer of the storage engine. before the transaction is committed, these buffered logs must be refreshed to the disk for persistence in advance, this is what DBAs often call "Log First" (Write-Ahead Logging ). After the transaction is committed, the data files mapped in the Buffer Pool are slowly refreshed to the disk. In this case, if the database crashes or goes down, when the system is restarted for recovery, the database can be restored to a previous state based on the logs recorded in the redo log. Incomplete transactions can be committed or rolled back, depending on the recovery policy.
When the system starts, it has allocated a continuous storage space for the redo log, recording the Redo Log in the order of appending, and improving performance through sequential IO. All transactions share the storage space of the redo log. Their Redo logs are recorded in an alternate order of statement execution. The following is a simple example:
Record 1: <trx1, insert...>
Record 2: <trx2, delete...>
Record 3: <trx3, update...>
Record 4: <trx1, update...>
Record 5: <trx3, insert...>
2. undo log
The undo log is mainly used for transaction rollback. In the transaction execution process, in addition to redo logs, a certain amount of undo logs are also recorded. The undo log records the status of data before each operation. If the transaction needs to be rolled back during execution, you can perform the rollback operation based on the undo log. The rollback of a single transaction only rolls back the operations performed by the current transaction, and does not affect the operations performed by other transactions.
The following is a simplified process of undo + redo transactions:
Assume there are two values, A and B, respectively. The values are 1, 2.
1. start transaction;
2. Record A = 1 to undo log;
3. update A = 3;
4. Record A = 3 to redo log;
5. Record B = 2 to undo log;
6. update B = 4;
7. Record B = 4 to redo log;
8. Refresh the redo log to the disk.
9. commit
At any step from 1 to 8, the system goes down and the transaction is not committed, so the transaction will not affect the data on the disk. If the system goes down between 8 and 9, you can choose to roll back after recovery or continue to commit the transaction because the redo log is persistent at this time. If the system goes down after 9, and the data changed in the memory ing cannot be flushed back to the disk, the data can be flushed back to the disk based on the redo log after the system recovers.
Therefore, the redo log actually guarantees the transaction persistence and consistency, while the undo log ensures the atomicity of the transaction.
Distributed transactions
There are many ways to implement distributed transactions. You can either use native transaction support provided by innoDB or use message queues to achieve eventual consistency of distributed transactions. Here we will mainly talk about innoDB's support for distributed transactions.
Mysql distributed transaction model. The model consists of three parts: Application (AP), Resource Manager (RM), and Transaction Manager (TM ).
The application defines the transaction boundary and specifies the transactions to be performed;
Resource Manager provides a way to access transactions. Generally, a database is a resource manager;
The transaction Manager coordinates various transactions in the global transaction.
Distributed transactions use the two-phase commit method. Prepare all transaction nodes in the first stage and notify the Transaction Manager ready. The second-stage Transaction Manager tells each node whether it is commit or rollback. If one node fails, all global nodes must be rollback to guarantee the atomicity of transactions.
Summary
When do I need to use transactions? I think transaction support is required for ACID scenarios in the business. Transactions are indispensable, especially in order systems and banking systems. This article mainly introduces the features of transactions and mysql innoDB's support for transactions. The transaction-related knowledge is far more than mentioned in the article. This article is only intended to inspire others. We hope readers will forgive me for its shortcomings.