Concurrency control is a core technology of any DBMS.
Operations on a database are performed within transactions.
A transaction is an indivisible unit of work: a set of SQL statements that are interdependent in the business logic and that together must exhibit the ACID properties.
Atomicity: the operations contained in a transaction form one logical unit; they either all succeed or all fail together.
Consistency: only valid data may be written to the database; otherwise the transaction must be rolled back to its original state.
Isolation: multiple users may access the same data concurrently without destroying its correctness and integrity; the modifications made by concurrent transactions must be isolated from one another.
Durability: once a transaction commits, its results must be made permanent.
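Atomicity and rollback can be illustrated with Python's built-in sqlite3 module. The `transfer` helper and the `account` table below are illustrative names, not from the original article; the point is that the two updates commit or roll back as one unit:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 100)])
conn.commit()

def transfer(conn, amount, fail=False):
    # The with-block commits on success and rolls back on any exception,
    # so the two updates are atomic: both apply or neither does.
    with conn:
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = 1", (amount,))
        if fail:
            raise RuntimeError("simulated crash mid-transfer")
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = 2", (amount,))

try:
    transfer(conn, 50, fail=True)
except RuntimeError:
    pass

balances = [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]
print(balances)  # [100, 100]: the debit from account 1 was rolled back
```

Without the failure, the same call would leave the accounts at 50 and 150: either both halves of the transfer happen, or neither does.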
A database executes many transactions at the same time, i.e., transactions run concurrently, and isolation between them is not guaranteed automatically. SQL-92 therefore defines transaction isolation levels, which describe how far a given transaction's behavior is exposed to other concurrently executing transactions, or conversely the degree to which a transaction must be isolated from the others. From lowest to highest, the levels are:
Read Uncommitted, Read Committed, Repeatable Read, Serializable
The higher the isolation level, the better the integrity and consistency of the data, but the greater the impact on concurrent performance.
When concurrent transactions operate on the same data, conflict, and the integrity and consistency of the data are not fully guaranteed, the following problems can occur:
Lost Update (first-class lost update): rolling back one transaction overwrites updated data already committed by another transaction.
Dirty Read: one transaction reads data that another transaction has updated but not yet committed.
Phantom Read: one transaction re-reads and sees newly inserted rows committed by another transaction.
Non-repeatable Read: one transaction re-reads data and sees updates committed by another transaction.
Second Lost Update (second-class lost update): a special case of the non-repeatable read, in which one transaction overwrites updated data committed by another transaction.
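The second-class lost update is easy to reproduce with two sqlite3 connections sharing one in-memory database (the `counter` table and the shared-cache URI are illustrative choices for this sketch). Each side reads the current value without holding a lock, computes "read value + 10", and writes it back; one increment silently disappears:

```python
import sqlite3

# Two connections sharing one in-memory database (SQLite shared-cache mode).
uri = "file:lostupdate?mode=memory&cache=shared"
c1 = sqlite3.connect(uri, uri=True)
c2 = sqlite3.connect(uri, uri=True)
c1.execute("CREATE TABLE counter (id INTEGER PRIMARY KEY, value INTEGER)")
c1.execute("INSERT INTO counter VALUES (1, 100)")
c1.commit()

# Both transactions read the same starting value without locking it ...
v1 = c1.execute("SELECT value FROM counter WHERE id = 1").fetchone()[0]
v2 = c2.execute("SELECT value FROM counter WHERE id = 1").fetchone()[0]

# ... and both write back "what I read + 10".
c2.execute("UPDATE counter SET value = ? WHERE id = 1", (v2 + 10,))
c2.commit()
c1.execute("UPDATE counter SET value = ? WHERE id = 1", (v1 + 10,))
c1.commit()

final = c1.execute("SELECT value FROM counter WHERE id = 1").fetchone()[0]
print(final)  # 110, not 120: c2's committed increment was overwritten by c1
```

Pessimistic locking or an optimistic version check (both shown later) would turn this silent overwrite into either a wait or an explicit failure.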
Transaction isolation levels and the problems each level may allow (per SQL-92):

Isolation level  | Dirty read   | Non-repeatable read | Phantom read
Read Uncommitted | possible     | possible            | possible
Read Committed   | not possible | possible            | possible
Repeatable Read  | not possible | not possible        | possible
Serializable     | not possible | not possible        | not possible
The key to ensuring the correctness of concurrent transactions is to make the schedules of conflicting transactions serializable. In theory, a scheduling policy that forbids all other transactions from running while one transaction executes necessarily yields a serializable schedule; it is the simplest strategy, but it is impractical because it prevents users from sharing database resources. In practice, most DBMSs (PostgreSQL, MySQL, Oracle, SQL Server, etc.) use locking (the pessimistic method) to guarantee transaction correctness, that is, to guarantee the serializability of concurrent schedules. Other methods exist as well, such as the timestamp method and the optimistic method.
Pessimistic concurrency control: a locking system prevents users from modifying data in ways that would affect other users. If a user performs an action that acquires a lock, no other user can perform a conflicting action until the lock's owner releases it. This method suits environments with intense data contention, where the cost of protecting data with locks is lower than the cost of rolling back transactions when conflicts occur; hence it is called pessimistic concurrency control.
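A minimal pessimistic sketch with sqlite3: the writer takes the lock up front, and a conflicting transaction fails fast instead of silently interleaving. Server databases such as PostgreSQL or MySQL would lock individual rows with SELECT ... FOR UPDATE; SQLite only offers a database-level write lock, which `BEGIN IMMEDIATE` acquires. The file path and table are illustrative:

```python
import os
import sqlite3
import tempfile

# Two connections to the same database file; isolation_level=None gives
# manual transaction control, and timeout=0.1 makes lock waits fail fast.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
c1 = sqlite3.connect(path, isolation_level=None, timeout=0.1)
c2 = sqlite3.connect(path, isolation_level=None, timeout=0.1)
c1.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
c1.execute("INSERT INTO account VALUES (1, 100)")

# c1 pessimistically takes the write lock before touching the data.
c1.execute("BEGIN IMMEDIATE")
c1.execute("UPDATE account SET balance = balance - 10 WHERE id = 1")

blocked = False
try:
    c2.execute("BEGIN IMMEDIATE")   # conflicts with c1's lock
except sqlite3.OperationalError:    # "database is locked"
    blocked = True
print(blocked)  # True: c2 must wait until c1 releases the lock

c1.execute("COMMIT")                # release the lock ...
c2.execute("BEGIN IMMEDIATE")       # ... and now c2 can proceed
c2.execute("COMMIT")
```

The conflict surfaces as a lock wait (here, a timeout error) rather than lost data, which is exactly the trade the pessimistic approach makes.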
Optimistic concurrency control: data is not locked when a user reads it. When the user performs an update, the system checks whether another user changed the data after it was read. If so, the update fails with an error; typically the user who receives the error rolls back the transaction and starts over. This approach suits environments with low data contention, where the cost of occasionally rolling back a transaction is lower than the cost of locking data on every read; hence it is called optimistic concurrency control.
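A common way to implement the check described above is a version column: the UPDATE only matches the row if it still carries the version the reader originally saw. The `optimistic_update` helper and the `version` column are illustrative, not a standard API:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100, 1)")
conn.commit()

def optimistic_update(conn, new_balance, expected_version):
    # No lock is held while the user works with the data; the UPDATE only
    # succeeds if the row still carries the version we originally read.
    cur = conn.execute(
        "UPDATE account SET balance = ?, version = version + 1 "
        "WHERE id = 1 AND version = ?",
        (new_balance, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows matched means a concurrent writer won

balance, version = conn.execute(
    "SELECT balance, version FROM account WHERE id = 1").fetchone()
print(optimistic_update(conn, balance + 10, version))  # True: first writer succeeds
print(optimistic_update(conn, balance + 20, version))  # False: stale version, retry needed
```

The losing writer gets an explicit failure and re-reads before retrying, which is exactly the "roll back and start over" behavior described above.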
Timestamp concurrency control: the basic difference between timestamping and locking is that locking synchronizes the interleaved execution of a set of transactions so that it is equivalent to some serial execution of them, whereas the timestamp method also synchronizes the interleaved execution but makes it equivalent to one particular serial order, determined by the transactions' timestamps. If a conflict occurs, it is resolved by aborting and restarting a transaction; on restart, the transaction is assigned a new timestamp.
A simple locking protocol resolves both read-write and write-write conflicts, but at a large cost to concurrent transaction performance. So in practice PostgreSQL, MySQL InnoDB, Oracle, SQL Server (configurable, off by default), Informix, and other databases also use multi-version concurrency control (MVCC). MVCC avoids read-write conflicts entirely, leaving locks to serialize only the transactions involved in write-write conflicts.
Subsequent sections cover the locking mechanisms of specific databases and their MVCC implementations.
-----------------
When reprinting, please credit the source:
blog.csdn.net/beiigang
Database Concurrency and Transaction Control, Part One: Overview