Database review 8 -- concurrency, database review 8
Database review CH15 concurrency 15.1 concurrency issues
The I (Isolation, independent) of the ACID nature of the transaction, and the concurrent transactions are transparent. The interspersed transactions will produce multiple data inconsistency problems (different from the Database Inconsistency State ):
- Lost Update problem: after the transaction TA updates a data X, the transaction TB updates the data X, resulting in the TA's Update of X being washed out, that is to say, the X of the "seen" in TA is inconsistent with the X in the database.
- Uncommitted Dependency problem: After TB updates data X, TA reads X/updates X. However, due to some rollback failures, the X that is seen in TA is inconsistent with the X in the database.
- Inconsistent Analysis problem: TA is a transaction that begins with TB (business logic first). However, due to concurrency, the TA that is executed first reads X, then, modify the TB or modify the data that is consistent with the TA but has not been read and will be read by the TA at the execution start time, as a result, the X and Y data that TA sees are inconsistent with the X and Y data during execution.
Note: there are still many problems caused by transaction concurrency. The above is just a classification method of the author of the original book.
15.2 concurrent scheduling and serialization (1) Definition
ConcurrencySchedulingIt is to arrange the execution sequence of all the commands of a concurrent transaction and maintain the operation order of the Internal commands of the transaction. The most common scheduling is serial scheduling, that is, multiple transactions still execute all the commands one by one in a serial order, but the concurrency of serial scheduling is obviously very poor.
Not all scheduling can ensure Database Consistency, but we always look forward to finding a Parallel Scheduling Scheme equivalent to serial Scheduling (that is, Parallel Scheduling without disrupting Database Consistency ).Serializable(Serializable)
Make the following assumptions when considering the serialization of transactions:
- Each transaction maintains Database Consistency, that is, there is no logical error.
- Sequential execution of a series of transactions also maintains database consistency.
- To simplify the process, only read/write operations are considered.
Define the existence of two commands oi and oj between transactionsConflict(Conflict) When and only when oi and oj access the same data item X, there is at least one command to trywrite(X)
; Two scheduling S1 and S2 are definedConflict equivalenceWhen and only when S1 can obtain S2 by exchanging a series of non-conflicting commands; obviouslySerializableConflicts are equivalent to serial scheduling.
(2) serializable Detection
Generally, a transaction sequence diagram (a directed graph between transactions) is constructed to detect the serializability of scheduling. A set of concurrent transactions T1 ,..., Tn. If Ti to Tj has an arc, there are conflicting commands Oi and Oj between Ti and Tj, AND Ti's conflicting command Oi is in front of the scheduling scheme.
After constructing the sequence diagram, judge the serializability of scheduling based on the following theorem:
A scheduling is serializable. When and only when the Sequence Graph is constructed without loops, A Sequence Graph without loops can be used to create a non-conflicting scheduling by means of topological sorting.
(3) Evaluation
Serializable detection occurs after the event and is not suitable for real-time concurrency control! Learning serializability is to better understand the concurrency Protocol below
15.3 lock-based concurrency Protocol
We are no stranger to the lock mechanism. We have discussed in detail the process and thread synchronization in the operating system.
Lock(X)
The simplest lock in concurrency isBinary lock(Also calledMutex lock), A transactionLock(X)
Only wait for itsUnlock(X)
Other transactions can be accessed. Because only one transaction can access the same data in the mutex lock, concurrent read operations are not allowed, which is inefficient.
The more commonly used efficient lock isShare/exclusive lock(Shared/Exclusive Lock, S-X Lock)
(1) Share/exclusive lock
The S-X lock has two states: shared and exclusive:
- Pass
Lock-S
Command to obtain the shared lock. Only data read operations that obtain the shared lock can be performed.
- Pass
Lock-X
Command to obtain the exclusive lock. You can read and write data that has obtained the exclusive lock.
Apply for a S-X lockConcurrency Control ManagerManagement, a transaction must obtain a lock to continue to operate on a data
S-X lock protocols are as follows:
- Transactions must obtain the S lock on t before retrieve tuples t.
- Transactions must obtain the X lock on t before upate tuples t. If there is an S lockUpgradeTo X lock
- If transaction A has an X lock on the tuples t, the application for any lock on t of another transaction B will be rejected.
- If transaction A has a s lock on the tuples t, the application for t's X lock from another transaction B will be rejected, and the application for t's S lock will be allowed.
- If the application for a lock has not been allowed, the transaction will wait until other transactions release the lock it applied.
- Usually, when commit and rollback are used, all locks applied for by the transaction are released by default.
Think carefully about the above description of the S-X lock, it is easy to find that will appearHungerEvenDeadlock(Deadlocked transactions will forcibly release the locks they own and roll back)
(2) Two-phase lock protocol
The Two-Phase Lock (Two Phase Lock) protocol is a protocol that ensures conflicting serializable scheduling solutions, transactions in the two-phase Lock protocol are divided into two phases (the time Point between the two phases is called the Lock Point ):
Two-phase lock cannot be avoidedDeadlockUnless the constraints [All X locks obtained by the transaction must be released at the end of the transaction] are addedStrict two-phase lock protocol
(3) automatic lock acquisition
The general database language does not directly provide explicit lock statements, but embeds the lock protocol into ** atomic ** read and write. The following is an implementation method:
READ(Ti, D) { if Ti has no lock on D { wait until no other T has a X-lock on D; grant Ti a S-lock on D; } read(D);}WRITE(Ti, D) { if Ti has no X-lock on D { wait until no other T has any lock on D; if Ti has a S-lock on D { upgrade S-lock to X-lock; } else { grant Ti a X-lock on D; } } write(D);}
(4) Implementation of the lock mechanism
Generally, the lock implementation is completed by the DBMS lock manager. The transaction sends a lock and release lock request to the lock manager, and the lock manager sends an approval (grant) to the transaction) apply for, reject the application, and force rollback messages (to prevent deadlocks)
Lock manager records lock management-related information in the hash table named lock table and indexes it using tuples. It records the list of transactions that approve the lock (including the types of locks that are approved) and wait
15.5 lock granularity (1) Lock Granularity
The data granularity in the database is from the top to the bottom: database, Region (do not know what this is), files, and tuples, which can be locked for data of different granularities.
Fine GranularityLocking refers to locking data with lower granularity levels, so that higher concurrency can be obtained, but the load of lock management overhead is large;Coarse granularityLock refers to locking data with a higher granularity level, which can obtain lock management with a lower overhead, but with a lower concurrency.
(2) S-X lock granularity expansion-intention lock
Intention lock(Intention Lock) is based on an applicationModerate GranularityAssume that transaction T applies for the X lock on a relational variable R (that is, a table). If global scan is performed to check whether all tuples in R are locked by other transactions, this is costly.
Therefore, in order to detect the lock conflict after such granularity division, we must improve the X-S lock protocol, the following intention lock protocol came into being:
- IS intention sharing lock: T adds the IS lock to a data object of a certain granularity, indicating that T intends to set the S lock for the successor node of this object (Data Object with lower granularity level, if T wants to set the S lock for the R tuples t, add the IS lock to the relational variable R (or add the IS lock to the database where r is located)
- IX intention exclusive lock: T adds an IX lock to a data object of a certain granularity, indicating that T intends to set an X lock for the successor node of the object (lower granularity level data object)
- SIX shared intention exclusive lock: T applies the SIX lock to data objects of a certain granularity, which is equivalent to adding the S lock and the IX lock first. For example, you need to read all the tuples of the relational variable R and modify some tuples, apply for the SIX lock
Multi-granularity extension protocol based on intention lockBasic PrinciplesThat is: T to lock a data object, You need to lock its upper-layer nodes, and apply for a lock from top to bottom, and release the lock from top to bottom.
15.5 deadlock handling
The transaction concurrency manager of DBMS can process deadlocks in two ways:Deadlock PreventionAndDeadlock Detection
(1) Deadlock Prevention
Deadlock PreventionAs its name implies, the database never enters the Deadlock State. Some policies are as follows:
- Requires all transactions to obtain all the required locks before starting the execution.Advance Application
- The order in which transactions lock data depends on [partial order], similar to the Deadlock DetectionWait Diagram
- Set the maximum wait time for the transaction lock. Timeout will roll the transaction.Timeout Policy
(2) Deadlock Detection
Deadlock DetectionYes to detectWait Diagram(Wait-For Graph, a directed Graph describing the transaction lock dependency) whether it is a ring mechanism. When a deadlock is detected, some locks will be rolled or all transactions will be rolled directly.
Wait for the following figure:
- Vertex V represents all active transactions
- Directed edge E = Ti-> Tj indicates that Ti is waiting for Tj to release the lock on a tuple t, and deleted edge E when Ti applies after corner E releases the lock.
We can use a directed graph to generate a ring detection algorithm to detect deadlocks.
Copyright Disclaimer: This article is an original article by the blogger and cannot be reproduced without the permission of the blogger.