In any multiuser database there must be a consistent set of rules for data modification, and when two different processes attempt to modify the same data at the same time, the database management system (DBMS) is responsible for resolving potential conflicts between them. Any relational database must support the ACID properties of transactions, so before digging into locks you should first understand the ACID properties of database transactions.
- Atomicity: a transaction executes as an atomic unit; either all of its statements take effect or none do.
- Consistency: the database's integrity constraints (unique constraints, foreign key constraints, check constraints, and so on) and trigger settings hold both before the transaction begins and after it ends. SQL Server guarantees this.
- Isolation: the execution of a transaction is undisturbed by other transactions; a transaction cannot see the intermediate state of data while other transactions are running.
- Durability: before a transaction completes successfully, the changes it made must have been persisted to physical storage, guaranteeing that the modifications are not lost even if the system crashes.
Theoretically, all transactions should be completely isolated from one another. In practice, however, the cost of full isolation is too high (complete isolation requires the Serializable isolation level). So SQL Server uses locks, like traffic lights at an intersection, to tell all concurrent connections which resources can be read and which can be modified at any given moment. When a transaction requests access to a resource in a way that is incompatible with an existing lock, SQL Server blocks the current transaction until the lock on the requested resource is released, achieving the so-called isolation.
To balance isolation and concurrency, SQL Server chooses Read Committed as the default isolation level for databases.
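You can check which level a session is actually running at via sys.dm_exec_sessions; the numeric codes below are SQL Server's documented values:

```sql
-- 2 (ReadCommitted) is what you should see by default
SELECT CASE transaction_isolation_level
           WHEN 0 THEN 'Unspecified'
           WHEN 1 THEN 'ReadUncommitted'
           WHEN 2 THEN 'ReadCommitted'
           WHEN 3 THEN 'RepeatableRead'
           WHEN 4 THEN 'Serializable'
           WHEN 5 THEN 'Snapshot'
       END AS CurrentIsolationLevel
FROM sys.dm_exec_sessions
WHERE session_id = @@SPID;
```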
Concurrent operations on a database by multiple users can lead to the following data inconsistencies:
Dirty read: one transaction reads data that has not been committed by another transaction.
A modifies the data and B reads it, but A then cancels the modification for some reason and the data reverts to its original value; the data B obtained is now inconsistent with the data in the database (sketched below).
Phantom read: within the same transaction, the same query executed twice returns a different number of records.
A reads the data, then B inserts new rows; when A reads again, the rowset differs from the one obtained before.
Non-repeatable read: within the same transaction, reading the same data twice returns different content.
User A reads the data, then user B reads and modifies it; when A reads the data again, the two values are inconsistent.
Lost update: transaction T1 reads the data, performs some actions, and then updates the data. Transaction T2 does the same thing; T1 and T2 may overwrite each other's updates when writing back, causing an error.
Users A and B both read the same data and modify it, and one user's modification destroys the other's result, as in a booking system.
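As a minimal sketch of the dirty-read scenario (the dbo.Accounts table is hypothetical), run these steps across two separate sessions:

```sql
-- Session A: modify but do not commit yet
BEGIN TRAN;
UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;

-- Session B: at READ UNCOMMITTED, this sees A's uncommitted change (dirty read)
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT Balance FROM dbo.Accounts WHERE AccountId = 1;

-- Session A: cancel the change; the value B read never officially existed
ROLLBACK;
```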
The main method of concurrency control is locking: preventing users from performing certain operations on the data for a period of time, so as to avoid inconsistency.
Understanding isolation levels in SQL Server
To avoid the effects listed above, SQL Server limits them to varying degrees by offering different isolation levels. SQL Server provides five options; from low to high, the isolation levels are (a T-SQL sketch of switching levels follows the list):
- Read Uncommitted: highest performance, but dirty reads, non-repeatable reads, and phantom reads may occur
- Read Committed: non-repeatable reads and phantom reads may occur
- Repeatable Read: phantom reads may occur
- Serializable: lowest performance; range locks cause concurrency to degrade
- Snapshot: avoids dirty reads and non-repeatable reads by keeping row versions in tempdb, which places an additional burden on tempdb
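A quick sketch of switching levels in T-SQL (the database name is hypothetical); note that Snapshot must first be enabled at the database level:

```sql
-- Per-session isolation level
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

-- Snapshot requires a one-time database setting before sessions can use it
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
```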
Lock modes
- Shared lock (S lock): the lock taken when reading a resource. A resource holding a shared lock cannot be modified. By default a shared lock is released as soon as the read finishes.
- Exclusive lock (X lock): incompatible with every other lock, including other exclusive locks. Exclusive locks are used for data modification; while a resource holds an exclusive lock, other transactions that request to read or modify the resource are blocked until the exclusive lock is released.
- Update lock (U lock): a U lock can be seen as a combination of an S lock and an X lock. To update data you must first find it, which can be understood as taking an S lock while searching; once the rows to modify are found, an X lock is required on them. SQL Server uses U locks to reduce deadlocks: because a U lock is compatible with S locks, the search phase of an update does not block plain reads, while U locks are incompatible with each other, which lowers the chance of deadlock.
- Intent lock (IS, IX, IU): an intent lock is more an indicator than a lock. In SQL Server, resources are hierarchical: a table can contain n pages, and a page can contain n rows. When we lock a row, the page containing that row, and the table, can be understood as partially locked. If another query needs to lock the page or table, checking every row the page or table contains for locks would be far too expensive. So when SQL Server locks a lower-granularity resource, it places an intent lock on the parent resources, telling other queries that part of the resource is locked. For example, when we update one row in a table, both its page and the table receive an intent exclusive (IX) lock (a sketch showing these modes in sys.dm_tran_locks follows this list).
- Key-range lock: when the Serializable transaction isolation level is used, key-range locks implicitly protect the range of rows contained in a recordset read by a Transact-SQL statement. Serializable requires that any query executed during a transaction obtain the same rowset each time it runs. A key-range lock ensures this by preventing other transactions from inserting new rows whose key values fall within the range read by the serializable transaction.
Key-range locks therefore prevent phantom reads. By protecting the key ranges between rows, they also prevent phantom insertions into the recordset the transaction accesses.
Key-range locks are placed on the index, covering a start and end key value. They block any attempt to insert, update, or delete rows whose key values fall in that range, because those operations must first acquire a lock on the index. For example, a serializable transaction might issue a SELECT statement reading all rows whose key value is between 'AAA' and 'CZZ'. A key-range lock on key values in the range 'AAA' to 'CZZ' prevents other transactions from inserting rows with keys in that range, such as 'ADG', 'BBD', or 'CAL' (see the sketch after this list).
- Schema lock: SQL Server uses schema locks to maintain the integrity of table structure. Unlike the other lock types, which provide data isolation, schema locks provide schema isolation for database objects such as tables, views, and indexes within a transaction.
- Bulk update lock (BU lock): used when bulk data is copied into a table and the TABLOCK hint is specified.
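As a rough sketch of these modes in action (dbo.Orders and its columns are hypothetical), hold a transaction open and inspect sys.dm_tran_locks from the same session. The UPDATE should show an X lock on the row or key plus IX locks on its page and table, while the SERIALIZABLE range scan should show RangeS-S key-range locks:

```sql
BEGIN TRAN;
-- X lock on the modified row/key, IX locks on its page and table
UPDATE dbo.Orders SET Status = 'Shipped' WHERE OrderId = 1;

SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID;
ROLLBACK;

-- Key-range locks appear only under SERIALIZABLE
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRAN;
SELECT * FROM dbo.Orders WHERE OrderId BETWEEN 1 AND 100;

SELECT resource_type, request_mode
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID;
ROLLBACK;

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;  -- restore the default
```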
Lock compatibility
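The full compatibility matrix is in the Microsoft documentation; a simplified version covering the modes discussed above (rows are the requested mode, columns the mode already granted) looks like this:

| Requested \ Granted | IS | S | U | IX | X |
|---|---|---|---|---|---|
| IS | Yes | Yes | Yes | Yes | No |
| S | Yes | Yes | Yes | No | No |
| U | Yes | Yes | No | No | No |
| IX | Yes | No | No | Yes | No |
| X | No | No | No | No | No |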
Lock granularity
Granularity is, in essence, the amount (size) of data locked as part of a query or update in order to provide complete isolation and serialization for the transaction. The lock manager needs to strike a balance between concurrent access to resources and the administrative overhead of maintaining a large number of low-level locks: the finer the lock granularity, the more concurrent users can access the same table at once, but the greater the overhead of maintaining those locks; the coarser the granularity, the lower the management overhead, but also the lower the concurrency.
The lock granularity supported by SQL Server covers rows, pages, keys, key ranges, indexes, tables, and databases.
Lock escalation
Lock escalation is the process of converting many finer-grained locks into fewer coarser-grained locks, which reduces overhead but increases the likelihood of concurrency contention.
When the SQL Server database engine acquires a low-level lock, it also places intent locks on the objects that contain the lower-level object:
- When a row or index key range is locked, the database engine places an intent lock on the page that contains the rows or keys.
- When a page is locked, the database engine places an intent lock on the higher-level object that contains the pages.
In addition to intent locks on the object, intent page locks are requested on the following pages: leaf-level pages of nonclustered indexes, data pages of clustered indexes, and heap data pages.
Thresholds that trigger lock escalation:
- A single Transact-SQL statement obtains at least 5,000 locks on a single, non-partitioned table or index.
- A single Transact-SQL statement obtains at least 5,000 locks on a single partition of a partitioned table, and the table's LOCK_ESCALATION option is set to AUTO via ALTER TABLE (see the sketch below).
- The number of locks in an instance of the database engine exceeds memory or configuration thresholds.
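The per-table escalation behavior is controlled with ALTER TABLE (the table name here is hypothetical):

```sql
-- TABLE (default): escalate straight to a table lock
-- AUTO: allow escalation to the partition level on partitioned tables
-- DISABLE: mostly prevent escalation for this table
ALTER TABLE dbo.Orders SET (LOCK_ESCALATION = AUTO);
```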
TIP: the database engine does not escalate row or key-range locks to page locks; it escalates them directly to table locks. Page locks, likewise, are always escalated to table locks.
```sql
-- View lock activity
SELECT * FROM sys.dm_tran_locks;
-- View transaction activity status
DBCC OPENTRAN;
```
Refer to: https://docs.microsoft.com/zh-cn/sql/relational-databases/system-dynamic-management-views/sys-dm-tran-locks-transact-sql?view=sql-server-2017
What is a deadlock?
The essence of a deadlock is a stalemate caused by multiple parties contending for resources. In two or more tasks, if each task locks a resource that the other tasks are trying to lock, the tasks become permanently blocked, producing a deadlock. Understanding deadlocks first requires understanding the concepts involved.
In the classic intersection example, each car occupies one road but needs a road occupied by another car, so they block each other, no one can move forward, and a deadlock results.
Causes of deadlock and four necessary conditions
The main causes of deadlocks are:
(1) Insufficient system resources.
(2) An unsuitable order of process execution.
(3) Improper allocation of resources.
If the system has sufficient resources and every process's resource requests can be satisfied, the likelihood of deadlock is very low; otherwise, contention for limited resources leads processes into deadlock. Likewise, processes that run in an unfortunate order and at different speeds can produce deadlock.
The four necessary conditions for creating a deadlock:
(1) Mutex condition: A resource can only be used by one process at a time.
(2) Hold-and-wait condition: when a process is blocked waiting for requested resources, it keeps holding the resources it has already obtained.
(3) No-preemption condition: resources a process has already acquired cannot be forcibly taken away before the process has finished using them.
(4) Cyclic waiting condition: a cyclic waiting resource relationship is formed between several processes.
These four conditions are all necessary for deadlock: whenever the system is deadlocked, all of them hold, and as long as any one of them is not satisfied, no deadlock can occur.
Two types of deadlocks
1. Cycle deadlock: two processes request locks on different resources, and each then requests a lock on the resource the other holds, forming a cycle.
2. Conversion deadlock: two or more processes each hold a shared lock on the same resource within a transaction and want to upgrade it to an exclusive lock, but none can upgrade until the others release their shared locks.
Some scenarios in SQL Server that generate deadlocks
1. Deadlock caused by bookmark lookup: this type of deadlock arises between a bookmark lookup and a data update. Simply put, an UPDATE statement takes an X lock on the base table and then needs to update the table's nonclustered index, but that index is being read by another connection's lookup under an S lock; that lookup then performs a bookmark lookup into the base table, whose data holds the X lock, and the two block each other, forming a deadlock.
Bookmark lookup: when the query optimizer uses a nonclustered index to locate rows, and the selected columns or the columns in the query condition are only partially contained in that index, a lookup is required to retrieve the additional fields. For a table with a clustered index this is a key lookup; for a heap table it is a RID lookup; both are forms of bookmark lookup. Simply put, a bookmark lookup occurs when the columns used by the query condition and returned by the SELECT are not fully contained in the index columns.
Solution: reduce bookmark lookups, for example by adding INCLUDE columns to the index, which lowers the probability of this type of deadlock.
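A minimal sketch, assuming a hypothetical dbo.Orders table where queries filter on CustomerId and return OrderDate and TotalAmount:

```sql
-- With the extra columns stored at the index leaf level, the query is
-- covered by the index and no bookmark lookup to the base table is needed
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
ON dbo.Orders (CustomerId)
INCLUDE (OrderDate, TotalAmount);
```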
2. Deadlock caused by foreign keys: this type of deadlock arises from foreign key constraints. When the primary table (the table whose primary key is referenced by the foreign key) updates data, SQL Server must check the foreign key constraint against the child table. An X lock is placed on the primary table, but that alone does not prevent another SPID from simultaneously inserting a child row that references the key being modified. To solve this, SQL Server uses a range lock for such updates, a lock otherwise available only at the Serializable isolation level. So even though the isolation level may be the default Read Committed, the behavior here is serializable, which makes deadlocks considerably more likely.
Solution: add an index on the foreign key column so that the range lock is placed on that index instead of the table itself. This reduces the probability of deadlock.
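A sketch under the same hypothetical schema, where dbo.OrderDetails.OrderId references the primary table's key:

```sql
-- Range locks taken while the parent key is updated can now land on this
-- narrow index instead of the child table itself
CREATE NONCLUSTERED INDEX IX_OrderDetails_OrderId
ON dbo.OrderDetails (OrderId);
```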
3. Deadlock caused by inconsistent resource ordering: when multiple transactions use resources in different orders, a deadlock cycle can form.
Solution: make all transactions use resources in the same order. This is one of the most common deadlock situations.
How to view deadlocks
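One long-standing way to capture deadlock details is via trace flags, which write the deadlock graph to the SQL Server error log; on newer versions the system_health Extended Events session also records an xml_deadlock_report event by default:

```sql
-- Older, node-oriented deadlock output
DBCC TRACEON (1204, -1);
-- More detailed, XML-style deadlock output
DBCC TRACEON (1222, -1);
```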
Prevention and optimization of deadlocks
Preventing deadlocks
Preventing deadlocks means breaking one or more of the four necessary conditions so that a deadlock cannot form. There are several approaches:
1) Break the mutual exclusion condition
Breaking mutual exclusion is severely constrained, but in SQL Server, if the business logic tolerates dirty reads, you can change the isolation level to Read Uncommitted or use a table hint. Reads then take no S locks, avoiding mutual exclusion with locks that are incompatible with S locks and reducing the probability of deadlock.
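A minimal sketch (table and columns hypothetical):

```sql
-- WITH (NOLOCK) is equivalent to READ UNCOMMITTED for this table:
-- the read takes no S locks, so dirty reads are possible
SELECT OrderId, Status
FROM dbo.Orders WITH (NOLOCK);
```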
2) Break the hold-and-wait condition
Because transactions are atomic, this condition cannot truly be broken. The practical approach is to keep transactions as short as possible so that they execute quickly, which also reduces the probability of deadlock.
3) Break the no-preemption condition
Because of the atomicity and consistency of transactions, the no-preemption condition likewise cannot be broken. But we can approach it from two angles: increasing resources and reducing resource consumption.
Increase resources: for example, create a nonclustered index. The index is an additional resource, so queries can often lock the nonclustered index instead of the base table; this works even better if the index can cover the query. Adding INCLUDE columns to an index therefore not only reduces bookmark lookups and improves performance, it also reduces deadlocks.
Reduce resource usage: for example, when querying, select only the columns you need (SELECT Col1, Col2) rather than SELECT *, which can trigger unnecessary bookmark lookups.
Ways to minimize deadlocks
- Access objects in the same order: if all transactions acquire objects in the same order, the second transaction simply waits until the first commits or rolls back, and no deadlock occurs.
- Avoid user interaction in transactions: avoid writing transactions that include user interaction, because batches that run without user input complete much faster than ones waiting on a manual response, such as a prompt asking for an application parameter. If a transaction is waiting for user input and the user goes to lunch, or even home for the weekend, the transaction is suspended and cannot complete. This hurts system throughput, because any locks held by the transaction are released only when it commits or rolls back; even without a deadlock, other transactions accessing the same resources are blocked until the transaction finishes.
- Keep transactions short and in one batch: deadlocks typically occur when several long-running transactions execute concurrently in the same database. The longer a transaction runs, the longer it holds its exclusive or update locks, blocking other activity and inviting deadlock. Keeping a transaction in one batch minimizes network round trips, shortening the time needed to complete the transaction and release its locks.
- Use a lower isolation level: determine whether the transaction can run at a lower isolation level. Under Read Committed, a transaction can read data another transaction has read (but not modified) without waiting for that transaction to complete. Using a lower isolation level such as Read Committed rather than a higher one such as Serializable shortens the time shared locks are held, reducing lock contention.
- Use bound connections: bound connections allow two or more connections opened by the same application to cooperate. Any lock acquired by a secondary connection is treated as if held by the primary connection, and vice versa, so they do not block each other.
Some suggestions for optimizing deadlocks
(1) For frequently queried tables, use clustered indexes wherever possible;
(2) Avoid SQL statements that touch a large number of records at once, especially INSERT and UPDATE statements;
(3) Ensure UPDATE and DELETE statements use an appropriate index;
(4) When using nested transactions, avoid commit and rollback conflicts;
(5) For queries with low data-consistency requirements, use WITH (NOLOCK);
(6) Reduce transaction size: open the transaction as late as possible and close it as early as possible, and keep every operation that does not need the transaction outside it;
(7) Have queries return only the columns you need; SELECT * is not recommended.