I forgot a detail in the previous article. What exactly does the range T-K represent? Range T-K lock represents a type of lock mode (avoiding Phantom reads) at the serializable isolation level to protect data within a range from being affected by concurrent transactions ). It consists of two parts:
The first part indicates that he locks an index range. In this range, all indexes are locked using the T lock;
The second part is the key that has been hit in this range. These keys will be locked using the K lock.
In this range, the locking mode of the index range and Specific Row is range t-K.
Let's take an example in the previous article:
Select [data] from [mytable] Where [index_column]> = 20 and [index_column] <= 40
The usage of the lock is:
In fact, the lock generated by the preceding statement has two parts: the first is the range s lock, the range is the index range of 20-40, and the second is the S lock used on the key, in the figure, we can see that three keys have been hit, which are the indexes corresponding to "infinity", "25", and "30. Its mode is the range S-S, its type is key, that is, their range lock is range S, key lock is s lock.
Update and insert locks
There are two types of locks involved, one is the range S-U lock and the other is the range X-X lock.
Range S-U, which gets the S lock for the selected index range and the key hit is locked using the U Lock for future conversion to the X lock. When the update is performed, the lock mode in the range Becomes range X-X. Because the updated data columns are different (which may be index columns, but may not), the indexes used are also different (clustering, non-clustering, unique, and so on ), therefore, the situation is not as easy as the range S-S lock so easy to draw the law. In general, there are several situations that are consistent. Here we will not experiment one by one (Here we strongly recommend that you read SQL Server 2008 internals, which describes the lock chapter clearly ):
First, when the index is a unique index, the equal judgment (for example, "=") is used. If this index hits, there will be no range T-K lock record range, and the corresponding record directly gets U Lock or X lock;
Second, the range T-K lock locks the next record if the index does not hit the record, regardless of whether the index is unique or not. (For more information about "Next", see the previous article );
Third, in the range condition (>, <, between), regardless of whether the index is unique, if the index hits, not only the range will get the range T-K lock, the "Next" record for the range also gets the range T-K lock.
Why does the serializable isolation level make deadlocks easier?
We can see from the figure in the first article that the serializable level can ensure the strictest data consistency, but these protection methods can be changed to deadlocks as long as they change slightly. In fact, in various isolation levels, the higher the data consistency, the more likely the deadlock will occur; the lower the data consistency, the less likely the deadlock will occur.
Among these isolation levels, serializable is the easiest to deadlock, thanks to the range T-K lock locking range not only limited to existing data, but also future data; not only limited to several existing data pages, it is a broad range.
Among them, the most terrible problem is"Next"Data lock. This can easily cause a large range of deadlocks. The first example is used to describe:
Select @ findcount = count (ID) from mytablewhere [fk_related_id] = @ argumentif (@ findcount> 0) beginrollback transactionreturn error_codeendinsert into mytable ([fk_related_id],…) Values (@ argument ,...) Commit transactionreturn success_code
In this example, the column fk_related_id in the mytable is a unique index (non-clustered), and the transaction isolation level is serializable. Different stored procedure execution will pass in different @ argument. On the surface, this is not a problem, but due to the lock of the "Next" data, in a slightly higher level of concurrency, there are about 80% failures, all of which come from deadlocks. We selected one of them:
We try to call the stored procedure within the range of [1, 1000] with 15 pressures per second in @ argument. In this process, a record whose @ argument is 115 is successfully inserted first!
ID |
Fk_related_id |
Data |
1 |
115 |
... |
Next, a record whose @ argument is 74 gets the opportunity. Let's assume that its session ID is. It executes the SELECT statement:
ID |
Fk_related_id |
Data |
1 |
115 (a gets the range S-S lock) |
... |
Next, a record with @ argument 4 gets the opportunity. Let's assume that its session ID is B. It executes the SELECT statement:
ID |
Fk_related_id |
Data |
|
115 (a, B get the range S-S lock) |
... |
Next, session a executes the insert statement, then the range S-S lock will try to perform a conversion test (range I-N lock), but this is obviously not feasible, because session B also obtained the range S-S lock, so session a fell into the waiting;
Session B also runs the insert statement. In the same way, it also gets stuck in waiting. In this way, session a waits for session B to give up the range lock, and session B waits for session A to give up the lock, this is a deadlock.
And the worse thing is that all records whose @ argument is less than 115 will try to get the next record a new range S-S lock, so as to enter the infinite waiting, at least, 1-115 records deadlocks, and eventually 114 must be abandoned, 1 success. This is why the serializable isolation level not only deadlocks, but also large-area deadlocks in some cases.
In short:At the serializable isolation level, as long as there is a situation similar to the same index as the condition for first read and then write, there is a high probability of deadlock in a large concurrency, in addition, if the existing record indexes are in a very backward position by sorting rules, a large deadlock may occur.
So how can we solve this problem? Er, lowering the isolation level is certainly a method. For example, if you can accept phantom read, Repeatable read is a good choice. But I suddenly saw the method of using select with updlock in a blog. In fact, this makes deadlocks easier.
For example, if a stored procedure select B and then select a, and another Stored Procedure select a and then select B, the order is different, when the exclusive lock is only read, a deadlock may occur.
So why is Repeatable read much better? Because Repeatable read tightly locks existing records without using the range lock. We still take the stored procedure as an example. In this way, only two locked rows of data are on the same page (because page-level locks are used by default), or close enough, only the records on this data page can be deadlocked, and the deadlock will not affect other records. Therefore, the deadlock probability is greatly reduced.
In our actual test, under the same test conditions, the deadlock failure rate is less than 100 when the concurrency is increased to 0.1%. Of course, we paid the price to allow phantom read.