Database Concurrency Control 1: Summary and Overview of Concurrent Transactions
Concurrency control is a core DBMS technology.
Database operations are performed within transactions.
A transaction is a group of mutually dependent operations that form an inseparable unit of work: a set of SQL statements that belong together in the business logic. Transactions have the ACID properties:
Atomicity: the operations contained in a transaction form a single logical unit; either all of them succeed or none of them do.
Consistency: only valid data may be written to the database; otherwise the transaction must roll back to its initial state.
Isolation: multiple users may access the same data concurrently without compromising its correctness and integrity, and the modifications made by one concurrent transaction must be independent of those made by the others.
Durability: once a transaction completes, its results must be made permanent.
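Atomicity in particular can be seen directly from any SQL database. The following is a minimal sketch using Python's standard-library SQLite module; the `account` table and the simulated mid-transfer failure are invented for illustration:

```python
import sqlite3

# Illustrative schema: a two-account transfer that fails halfway.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('alice', 100), ('bob', 0)")
conn.commit()

try:
    with conn:  # opens a transaction; commits on success, rolls back on exception
        conn.execute("UPDATE account SET balance = balance - 60 WHERE name = 'alice'")
        # Simulated crash before bob is credited:
        raise RuntimeError("failure mid-transfer")
except RuntimeError:
    pass

# The debit to alice did not survive: the transaction was all-or-nothing.
print(conn.execute("SELECT balance FROM account ORDER BY name").fetchall())
# [(100,), (0,)]
```

Because the exception aborted the transaction, the partial update was rolled back and both balances are unchanged.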
A database usually has multiple transactions in flight at the same time, i.e. concurrent transactions, and full isolation between them cannot always be guaranteed. SQL-92 therefore defines transaction isolation levels, which describe how exposed a given transaction is to other concurrent transactions, or conversely the degree to which it must be isolated from them. The isolation levels are:
Read Uncommitted, Read Committed, Repeatable Read, Serializable
The higher the isolation level, the better data integrity and consistency are guaranteed, but the greater the impact on concurrency performance.
When concurrent transactions operate on the same data, conflicts may occur and data integrity and consistency can no longer be fully guaranteed. The following anomalies may appear:
Lost update (first kind): rolling back one transaction overwrites updates already committed by other transactions.
Dirty read: one transaction reads data that another transaction has updated but not yet committed.
Phantom read: one transaction reads newly inserted rows committed by another transaction.
Non-repeatable read: one transaction reads updated data committed by another transaction.
Second lost update (second kind): a special case of the non-repeatable read, in which one transaction overwrites updates already committed by another transaction.
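The dirty-read anomaly can be demonstrated (by its absence) with two connections to the same SQLite database, again using only the Python standard library; the table name and temp-file path are invented for illustration. SQLite never exposes another connection's uncommitted rows, so the reader behaves at least like Read Committed:

```python
import sqlite3, tempfile, os

# Two connections to the same database file, in explicit-transaction mode.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path, isolation_level=None)
reader = sqlite3.connect(path, isolation_level=None)
writer.execute("CREATE TABLE t (v INTEGER)")

writer.execute("BEGIN")
writer.execute("INSERT INTO t VALUES (1)")      # uncommitted so far

before = reader.execute("SELECT COUNT(*) FROM t").fetchone()
print(before)   # (0,) -> the uncommitted row is invisible: no dirty read

writer.execute("COMMIT")
after = reader.execute("SELECT COUNT(*) FROM t").fetchone()
print(after)    # (1,) -> visible only once committed
```

A database running at Read Uncommitted would have returned the row in the first query as well.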
Transaction isolation levels and the anomalies each one permits:
Isolation level  | Dirty read | Non-repeatable read | Phantom read
Read Uncommitted | possible   | possible            | possible
Read Committed   | prevented  | possible            | possible
Repeatable Read  | prevented  | prevented           | possible
Serializable     | prevented  | prevented           | prevented
The key to keeping concurrent transactions correct is to serialize the scheduling of conflicting transactions. In theory, a policy that forbids all other transactions from running while one transaction executes is trivially serializable; it is the simplest scheduling policy, but it is impractical, because it prevents users from sharing database resources at all. Most DBMSs therefore use locking to ensure transaction correctness, i.e. to guarantee that the schedule of parallel operations is serializable. Other methods exist as well, such as timestamp ordering and optimistic concurrency control.
Pessimistic concurrency control: a locking system prevents users from modifying data in ways that affect other users. Once an operation causes a lock to be applied, other users cannot perform conflicting operations until the lock owner releases it. This approach suits environments with heavy data contention, where the cost of protecting data with locks is lower than the cost of rolling transactions back when conflicts occur; hence the name pessimistic concurrency control.
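Pessimistic locking can be sketched with SQLite's `BEGIN IMMEDIATE`, which acquires the write lock up front; a second writer then fails instead of proceeding. The file path and short busy timeout are illustrative choices:

```python
import sqlite3, tempfile, os

path = os.path.join(tempfile.mkdtemp(), "lock.db")
a = sqlite3.connect(path, isolation_level=None)
b = sqlite3.connect(path, isolation_level=None, timeout=0.1)
a.execute("CREATE TABLE t (v INTEGER)")

a.execute("BEGIN IMMEDIATE")          # connection a takes the write lock eagerly
err = None
try:
    b.execute("BEGIN IMMEDIATE")      # connection b conflicts with the held lock...
except sqlite3.OperationalError as e:
    err = e
print(err)                            # ...and fails: "database is locked"

a.execute("COMMIT")                   # lock released
b.execute("BEGIN IMMEDIATE")          # now b can acquire it
b.execute("COMMIT")
```

Row-level databases express the same idea with constructs such as `SELECT ... FOR UPDATE`; SQLite locks at database granularity, which makes the conflict easy to show.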
Optimistic concurrency control: data is not locked when it is read. When an update is executed, the system checks whether another user changed the data after it was read. If so, an error is raised, and typically the user who receives it rolls back the transaction and starts over. This approach suits environments with little data contention, where the cost of occasionally rolling back a transaction is lower than the cost of locking data on every read; hence the name optimistic concurrency control.
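A common way to implement this check is a version column: the update only applies if the version is still the one that was read, and a conflict triggers a retry. The schema, function name, and retry loop below are illustrative, not a fixed API:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, qty INTEGER, version INTEGER)")
conn.execute("INSERT INTO item VALUES (1, 10, 0)")

def decrement_qty(conn, item_id, amount):
    """Read without locking, then update only if the version is unchanged."""
    while True:
        qty, version = conn.execute(
            "SELECT qty, version FROM item WHERE id = ?", (item_id,)).fetchone()
        cur = conn.execute(
            "UPDATE item SET qty = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (qty - amount, item_id, version))
        if cur.rowcount == 1:
            return              # our read was still current: success
        # Another transaction committed in between: loop and re-read.

decrement_qty(conn, 1, 3)
print(conn.execute("SELECT qty, version FROM item").fetchone())  # (7, 1)
```

In a real multi-user system the failed `UPDATE` (rowcount 0) is exactly the "another user updated the data" error described above, and the loop plays the role of rolling back and starting again.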
Timestamp-based concurrency control: the basic difference from locking is this. Locking synchronizes the concurrent (i.e. interleaved) execution of a group of transactions so that it is equivalent to some serial execution of them. The timestamp method also synchronizes the interleaved execution of a group of transactions, but makes it equivalent to one specific serial execution, namely the order determined by the transactions' timestamps. When a conflict occurs, a transaction is aborted and restarted, and it receives a new timestamp on restart.
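The core rule of basic timestamp ordering can be sketched in a few lines: the scheduler tracks the largest read and write timestamps per data item, and any operation that arrives "too late" relative to a younger transaction forces an abort (and a restart with a fresh timestamp). The class and method names here are invented for illustration:

```python
# Toy sketch of the basic timestamp-ordering rule; not a full scheduler.
class TimestampOrdering:
    def __init__(self):
        self.read_ts = {}   # data item -> largest timestamp that read it
        self.write_ts = {}  # data item -> largest timestamp that wrote it

    def read(self, ts, item):
        if ts < self.write_ts.get(item, 0):
            return False    # item already overwritten by a younger txn: abort
        self.read_ts[item] = max(self.read_ts.get(item, 0), ts)
        return True

    def write(self, ts, item):
        if ts < self.read_ts.get(item, 0) or ts < self.write_ts.get(item, 0):
            return False    # a younger txn already read or wrote it: abort
        self.write_ts[item] = ts
        return True

to = TimestampOrdering()
assert to.write(ts=1, item="x")       # txn 1 writes x
assert to.read(ts=2, item="x")        # txn 2 reads x
assert not to.write(ts=1, item="x")   # txn 1 writes after txn 2's read: conflict, restart
```

The final `write` fails because allowing it would contradict the serial order implied by the timestamps; the aborted transaction would restart with a new, larger timestamp.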
A locking protocol alone can solve all concurrency control problems, including read/write conflicts and write/write conflicts, but at a heavy cost to concurrent transaction performance. In practice, therefore, PostgreSQL, MySQL InnoDB, Oracle, SQL Server (configurable, off by default), Informix and other databases also use multi-version concurrency control (MVCC). MVCC avoids read/write conflicts entirely, so locks are only needed to serialize transactions with write/write conflicts.
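The reason MVCC avoids read/write conflicts is that writers create new versions instead of overwriting data in place, and each reader sees the newest version committed no later than its own snapshot. A toy sketch, with all names invented for illustration:

```python
# Toy sketch of MVCC snapshot reads; real systems also track txn visibility,
# garbage-collect old versions, etc.
class MVCCStore:
    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), in commit order

    def write(self, key, value, commit_ts):
        # A write appends a new version; it never destroys the old one,
        # so in-flight readers are not blocked and are not disturbed.
        self.versions.setdefault(key, []).append((commit_ts, value))

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts."""
        result = None
        for commit_ts, value in self.versions.get(key, []):
            if commit_ts <= snapshot_ts:
                result = value
        return result

store = MVCCStore()
store.write("x", "v1", commit_ts=10)
store.write("x", "v2", commit_ts=20)
print(store.read("x", snapshot_ts=15))  # v1 -> an older snapshot still sees v1
print(store.read("x", snapshot_ts=25))  # v2 -> a newer snapshot sees the update
```

Readers never wait for writers and writers never wait for readers; only two concurrent writers of the same item still need a lock to be serialized.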
In subsequent sections we will look at the locking mechanisms and MVCC implementations of specific databases.
-----------------
Reprinted from: blog.csdn.net/beiigang