[Introduction: the basic theory of locking is consistent across major databases, but implementations vary. SQL Server emphasizes system-managed locking: when a user issues an SQL request, the system analyzes the request and automatically applies an appropriate lock, weighing locking requirements against system performance. The system also optimizes during operation, implementing dynamic locking. In general, the system's automatic lock management is sufficient for ordinary use. However, if you have special requirements for data security, database integrity, and consistency, you need to understand the SQL Server locking mechanism and master the methods of locking the database.]
Locks are a very important concept in databases. They are mainly used to ensure database integrity and consistency in multi-user environments. Data inconsistency can occur when multiple users manipulate the same data in a database at the same time: if transactions are not locked and multiple users access the database concurrently, problems may arise when their transactions use the same data simultaneously. These problems include lost updates, dirty reads, non-repeatable reads, and phantom reads:
1. Lost update. When two or more transactions select the same row and then update it based on the originally selected value, updates are lost: each transaction is unaware of the others, and the last update overwrites updates made by the other transactions. For example, two editors make an electronic copy of the same document. Each editor independently changes their copy and saves it, overwriting the original document. The editor who saves last overwrites the changes made by the first editor. This problem can be avoided if the second editor can make changes only after the first editor has finished.
2. Dirty read. While one transaction is accessing and modifying data that has not yet been committed to the database, another transaction reads that data. Because the data has not been committed, what the second transaction reads is dirty data, and operations based on it may be incorrect. For example, an editor is changing an electronic document. During the change, another editor copies the document (the copy contains all changes made so far) and distributes it to the intended audience. The first editor then decides the current changes are wrong, deletes them, and saves the document. The distributed document now contains edits that no longer exist and should never have been treated as existing. This issue can be avoided if no one can read the modified document before the first editor finalizes the changes.
3. Non-repeatable read. A transaction reads the same data multiple times; before the transaction completes, another transaction modifies that data. The two reads within the first transaction may therefore return different results, which is why this is called a non-repeatable read. For example, an editor reads the same document twice, but the author rewrites it between the two reads. When the editor reads the document the second time, it has changed, and the original read cannot be repeated. This issue can be avoided if the editor can read the document only after the author has finished writing it.
4. Phantom read. This occurs when a transaction does not execute in isolation. For example, the first transaction modifies data in a table, affecting all rows in the table; at the same time, a second transaction inserts a new row into that table. Later, the user running the first transaction finds that the table still contains rows that were not modified, as if seeing an illusion. For example, an editor changes a document submitted by an author, but when the production department merges the changes into the master copy, the author has added new, unedited material to the document. This issue can be avoided if no one can add new material to the document before the editor and production department finish processing the original.
Locking is therefore used to handle concurrent access by multiple users. A lock is the primary means of preventing other transactions from accessing a specified resource and of implementing concurrency control. When a user locks an object in the database, other users cannot access that object. The impact of locking on concurrent access is reflected in the lock granularity. To understand lockable resources, you should first understand how the system manages space. In SQL Server 2000, the smallest unit of space management is the page, which is 8 KB. All data, logs, and indexes are stored on pages. There is also a restriction on page use: a row of data in a table must fit on a single page and cannot span pages. Above the page, the unit of space management is the extent; an extent is eight contiguous pages, and it is the smallest unit allocated to tables and indexes. A database consists of one or more tables and indexes, that is, of many extents. A lock placed on a table restricts concurrent access to the entire table; a lock placed on an extent restricts access to that entire extent; a lock placed on a data page restricts access to that entire page; a lock placed on a row restricts concurrent access only to that row.
SQL Server 2000 has multi-granularity locking, allowing a transaction to lock different types of resources. To minimize the cost of locking, SQL Server automatically locks resources at a level appropriate to the task. Locking at a small granularity (such as rows) increases concurrency but incurs higher overhead, because if many rows are locked, many more locks must be managed. Locking at a large granularity (such as a table) is expensive in terms of concurrency, because locking the entire table blocks access by other transactions to any part of it, but the overhead is low because few locks need to be maintained. SQL Server can lock rows, pages, extents, tables, databases, and other resources.
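To see which locks are currently held at which granularity, you can ask the server directly; the following is a minimal sketch using the sp_lock and sp_who system procedures available in SQL Server 2000 (the session id 52 is hypothetical):

    -- List all locks currently held or requested in the instance.
    -- Columns include spid (session), dbid, ObjId, the resource Type
    -- (RID, KEY, PAG, EXT, TAB, DB) and the Mode (S, U, X, IS, IX, SIX, ...)
    EXEC sp_lock

    -- Show locks for a single session, e.g. spid 52
    EXEC sp_lock 52

    -- Identify which session belongs to which user/host
    EXEC sp_who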
A row is the smallest unit of space that can be locked, and a row-level lock occupies the least data. During transaction processing, other transactions may continue to manipulate the other data in the same table or on the same page, which greatly reduces the waiting time of other transactions and improves the concurrency of the system.
A page-level lock locks an entire page at a time, regardless of how much of its data the transaction actually touches; data on that page cannot be manipulated by other transactions. Before SQL Server 7.0, the page lock was the finest granularity available. A page-level lock occupies more data than a row-level lock: even if a transaction operates on only one row of the page, the other rows on that page cannot be used by other transactions. Page-level locking therefore wastes data, that is, data on the page is occupied but not used. This waste cannot exceed the rows on a single page.
Table-level locks are also very important. A table-level lock locks the entire table while a transaction operates on its data; other transactions cannot access any data in that table. Table-level locks are generally used when a transaction processes a large volume of data. Compared with row-level and page-level locks, a table-level lock consumes the fewest system resources, such as memory, but occupies the most data. During a table-level lock, a large amount of data may be wasted, because the lock covers the whole table and other transactions cannot manipulate any of its data.
An extent lock is a special type of lock used only in certain special cases. An extent-level lock means a transaction occupies an extent, which cannot be occupied by other transactions at the same time. This type of lock is used, for example, when the system allocates physical space during database and table creation. The system allocates space in extents, and while doing so it takes an extent lock to prevent other transactions from using the same extent simultaneously. Once the allocation is complete, the extent lock is no longer used; in particular, extent locks are not used for ordinary data-manipulation transactions.
A database-level lock locks the entire database and prevents any other user or transaction from accessing it. It is a very special lock, used only during database restore operations. A lock at this level is the highest level, because it controls the operation of the entire database. When restoring a database, you set it to single-user mode, so that the system prevents other users from performing operations on it.
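As a minimal sketch of that restore scenario (assuming a database named pubs and a hypothetical backup file path):

    -- Force the database into single-user mode, rolling back other sessions
    ALTER DATABASE pubs SET SINGLE_USER WITH ROLLBACK IMMEDIATE
    -- Restore; the server holds the database-level lock for the duration
    RESTORE DATABASE pubs FROM DISK = 'C:\backup\pubs.bak'  -- hypothetical path
    -- Return the database to normal multi-user operation
    ALTER DATABASE pubs SET MULTI_USER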
Row-level locks are optimal in the sense that they never waste data by occupying rows that are not used. However, if a transaction frequently operates on many records in a table, row-level locks will be placed on many rows, and the number of locks in the system will increase dramatically, raising the system load and hurting performance. Therefore, SQL Server also supports lock escalation. Lock escalation means adjusting the lock granularity, replacing many low-granularity locks with a few higher-granularity locks to reduce system load. In SQL Server, when a transaction holds many locks and the escalation threshold is reached, the system automatically escalates row-level and page-level locks to table-level locks. Both the threshold and the escalation itself are determined automatically by the system and require no user configuration.
When locking in SQL Server, in addition to locking different resources, you can also lock to different degrees; that is, there are multiple lock modes. The lock modes in SQL Server include:
1. Shared locks. In SQL Server, shared locks are used for all read-only data operations. Shared locks are non-exclusive: they allow multiple concurrent transactions to read the locked resource. By default, SQL Server releases a shared lock as soon as the data has been read. For example, when "select * from authors" is executed, the first page is locked; after it has been read, the lock on it is released and the second page is locked. In this way, the first page, no longer locked, can be modified while the read operation continues. However, both the transaction isolation level set on the connection and locking hints in the SELECT statement can change this default. For example, "select * from authors holdlock" keeps the table locked throughout the query until it completes.
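A minimal sketch of the difference, assuming the authors table of the pubs sample database:

    -- Default behavior: each shared lock is released as soon as the data is read
    SELECT * FROM authors

    BEGIN TRAN
    -- HOLDLOCK keeps the shared locks until the transaction ends, so no one
    -- can modify authors between this read and the COMMIT
    SELECT * FROM authors WITH (HOLDLOCK)
    -- ... other work that relies on authors not changing ...
    COMMIT TRAN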
2. Update locks. An update lock is used during the initialization phase of a modification operation to lock resources that may be modified; this prevents the deadlocks that arise from using shared locks. With shared locks, a modification proceeds in two steps: first a shared lock is obtained and the data is read; then the shared lock is upgraded to an exclusive lock and the modification is performed. If two or more transactions simultaneously apply shared locks to the same resource, each must upgrade its shared lock to an exclusive lock when it modifies the data. At that point neither releases its shared lock; each waits for the other to release, producing a deadlock. If, instead, you apply an update lock before the modification and upgrade it to an exclusive lock when the data is modified, the deadlock is avoided.
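A minimal read-then-update sketch using the UPDLOCK hint (the accounts table and its columns are hypothetical):

    BEGIN TRAN
    -- Take an update lock up front instead of a shared lock; only one
    -- transaction at a time can hold the update lock on these rows
    SELECT balance FROM accounts WITH (UPDLOCK) WHERE id = 1
    -- The update lock converts to an exclusive lock here, with no risk
    -- of two readers deadlocking on the upgrade
    UPDATE accounts SET balance = balance - 100 WHERE id = 1
    COMMIT TRAN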
3. Exclusive locks. An exclusive lock is reserved for modifying data. Resources it locks can be neither read nor modified by other transactions.
4. Schema locks. A schema modification lock is used when executing Data Definition Language (DDL) operations on a table (such as adding a column or dropping the table). A schema stability (Sch-S) lock is used when compiling a query. The Sch-S lock does not block any transactional lock, including exclusive locks, so while a query is being compiled, other transactions (including those holding exclusive locks on the table) can continue to run. However, DDL operations cannot be performed on the table during that time.
5. Intent locks. An intent lock indicates that SQL Server intends to acquire a shared or exclusive lock at a lower level of the resource hierarchy. For example, a table-level intent shared lock indicates that a transaction intends to place shared locks on pages or rows within the table. Intent locks come in three forms: intent shared, intent exclusive, and shared with intent exclusive. An intent shared lock indicates that the transaction intends to place shared locks on lower-level resources, in order to read data. An intent exclusive lock indicates that the transaction intends to place exclusive locks on lower-level resources, in order to modify data. A shared with intent exclusive lock indicates that the transaction allows other transactions to read the top-level resource with shared locks while it intends to place exclusive locks on lower-level resources.
6. Bulk update locks. A bulk update lock is used when data is bulk copied into a table and either the TABLOCK hint is specified or the table lock on bulk load option is set with sp_tableoption. The bulk update lock allows processes to bulk copy data into the same table concurrently, while preventing processes that are not bulk copying from accessing the table.
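A minimal sketch of both ways to obtain the bulk update lock (the orders table and the data file path are hypothetical):

    -- Option 1: request the bulk update lock per statement via the TABLOCK hint
    BULK INSERT orders FROM 'C:\data\orders.dat' WITH (TABLOCK)

    -- Option 2: set the table option once, so every bulk load takes the lock
    EXEC sp_tableoption 'orders', 'table lock on bulk load', 'true'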
In SQL Server, letting the system manage locks automatically is recommended. The system analyzes each SQL request and automatically applies the appropriate locks, and when the number of locks grows too large, the system performs lock escalation. As mentioned above, the escalation threshold is configured automatically and requires no user setting. In practice, however, an application sometimes must lock a table explicitly in order to run correctly and maintain data consistency. For example, suppose a transaction must compute statistics over several tables keyed by ID. To guarantee that the statistics are consistent and correct in time, no other application or transaction may write to these tables from the moment the first table is read until the last one is finished. In this case, the application wants to lock the tables explicitly, starting from the first table or from the start of the transaction; this calls for manual locking (also known as explicit locking), as sketched after the hint list below.
You can specify table-level locking hints in SELECT, INSERT, UPDATE, and DELETE statements to direct Microsoft SQL Server 2000 toward the desired lock type. Use table-level locking hints to change the default locking behavior when you need finer control over the lock types acquired on an object.
The available table-level locking hints are as follows:
1. HOLDLOCK: hold the shared lock on the table until the entire transaction ends, instead of releasing it as soon as the statement has executed.
2. NOLOCK: acquire no shared or exclusive locks. With this option in effect, uncommitted data, or "dirty data", may be read. This option applies only to SELECT statements.
3. PAGLOCK: take page locks where a single table lock would otherwise be taken.
4. READCOMMITTED: perform the scan with the same lock semantics as a transaction running at the read committed isolation level. By default, SQL Server 2000 operates at this isolation level.
5. READPAST: skip rows that are locked. This option lets a transaction skip rows locked by other transactions when reading data, rather than blocking until those transactions release their locks. READPAST applies only to SELECT statements in transactions operating at the read committed isolation level.
6. READUNCOMMITTED: equivalent to NOLOCK.
7. REPEATABLEREAD: perform the scan with the same lock semantics as a transaction running at the repeatable read isolation level.
8. ROWLOCK: use row-level locks instead of coarser-grained page-level and table-level locks.
9. SERIALIZABLE: perform the scan with the same lock semantics as a transaction running at the serializable isolation level. Equivalent to HOLDLOCK.
10. TABLOCK: use a table-level lock instead of row-level or page-level locks. SQL Server releases this lock after the statement executes; if HOLDLOCK is also specified, the lock is held until the transaction ends.
11. TABLOCKX: take an exclusive lock on the table. This lock prevents other transactions from reading or updating data in the table until the statement or the entire transaction ends.
12. UPDLOCK: take update locks instead of shared locks while reading the table, and hold them until the statement or the entire transaction ends. UPDLOCK lets you read data without blocking other readers, and guarantees that the data has not been modified by other users when you later update it.
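Returning to the statistics scenario described above, here is a minimal sketch of explicit locking (the table names table1 and table2 are hypothetical):

    BEGIN TRAN
    -- Take shared table locks and hold them to the end of the transaction,
    -- so no one can write to these tables while the statistics are computed
    SELECT COUNT(*) FROM table1 WITH (TABLOCK, HOLDLOCK)
    SELECT COUNT(*) FROM table2 WITH (TABLOCK, HOLDLOCK)
    -- ... compute and store the statistics ...
    COMMIT TRAN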
Deadlock
In a database system, a deadlock occurs when multiple users (processes) each lock a resource and then try to request locks on resources the others have already locked. This produces a ring of lock requests in which multiple users (processes) are all waiting for the other party to release its locked resources. This is the most typical form of deadlock. For example, consider two concurrent transactions A and B. Transaction A performs two operations: locking table part, then requesting access to table supplier. Transaction B also performs two operations: locking table supplier, then requesting access to table part. The result is a deadlock between transactions A and B.
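A minimal sketch of this deadlock, assuming tables named part and supplier (column names are hypothetical) and two sessions running concurrently:

    -- Session 1 (transaction A)
    BEGIN TRAN
    UPDATE part SET price = price WHERE partno = 1       -- locks part
    -- ... meanwhile session 2 locks supplier ...
    UPDATE supplier SET name = name WHERE suppno = 1     -- waits on supplier
    COMMIT TRAN

    -- Session 2 (transaction B)
    BEGIN TRAN
    UPDATE supplier SET name = name WHERE suppno = 1     -- locks supplier
    UPDATE part SET price = price WHERE partno = 1       -- waits on part: deadlock
    COMMIT TRAN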
The second kind of deadlock occurs when several long-running transactions execute in parallel in a database: when the query processor handles a very complex query, such as a join query, the processing order cannot be controlled, and a deadlock may occur.
In SQL Server, the system automatically searches for and handles deadlocks periodically. In each search, it identifies all sessions waiting on a lock request. If an identified session is still waiting in the next search, SQL Server begins a recursive deadlock search. When a lock-request ring is detected, SQL Server automatically selects one thread that can break the deadlock (the deadlock victim) and ends the deadlock: it rolls back the victim's transaction, notifies the thread's application (by returning error message 1205), cancels the thread's current request, and allows the transactions of the other threads to continue. Normally, SQL Server chooses as the victim the thread whose transaction would be least expensive to undo. In addition, you can use a SET statement to set a session's DEADLOCK_PRIORITY to LOW. The DEADLOCK_PRIORITY option controls how a session is weighted in a deadlock: if it is set to LOW, the session becomes the preferred victim when a deadlock occurs.
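For example, a low-priority background session might volunteer itself as the victim:

    -- Prefer this session as the deadlock victim
    SET DEADLOCK_PRIORITY LOW
    BEGIN TRAN
    -- ... background work that may conflict with interactive sessions ...
    COMMIT TRAN
    -- Restore the default weighting
    SET DEADLOCK_PRIORITY NORMAL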
After understanding the concept of deadlock, you can use the following methods in an application to avoid deadlocks:
1. Arrange the order of table access sensibly.
2. Avoid user intervention inside transactions as much as possible, minimize the work handled by a single transaction, and keep transactions short and batched.
3. Discretize data access in time. Time-domain discretization of data access refers to the control measures used in a client/server architecture to stagger the time periods during which the database, or objects in it, are accessed. Concretely: schedule the execution times of background transactions sensibly, and manage background transactions with a workflow. While managing tasks, the workflow limits the number of concurrent threads for the same type of task (usually to one) to prevent excessive resource occupation; it also schedules the sequence and timing of different tasks sensibly to avoid running multiple background tasks simultaneously, and it avoids running background tasks during peak hours of front-end transactions.
4. Discretize the data storage space. The data-storage-space discretization method distributes the data of one logical table across several discrete spaces to improve access performance. Concretely: a large table can be split into several smaller tables by row or by column, or partitioned by user group.
5. Use the lowest isolation level possible. The isolation level is the degree to which multi-user transactions are isolated in order to guarantee the integrity and consistency of database data. SQL-92 defines four isolation levels: read uncommitted, read committed, repeatable read, and serializable. Choosing a high isolation level, such as serializable, lets the system guarantee integrity and consistency to a greater extent through better isolation, but it greatly increases the chance of deadlocks from transaction conflicts, which significantly hurts performance.
6. Use bound connections. Bound connections allow two or more connections to share a transaction and its locks; any of the connections can acquire locks as if it were the single owning transaction, so these connections can share data without lock conflicts.
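A minimal sketch of a bound connection in SQL Server 2000, using the sp_getbindtoken and sp_bindsession system procedures:

    -- Connection 1: start a transaction and obtain its bind token
    BEGIN TRAN
    DECLARE @token varchar(255)
    EXEC sp_getbindtoken @token OUTPUT
    SELECT @token  -- pass this value to the second connection

    -- Connection 2: join connection 1's transaction using the token
    EXEC sp_bindsession 'token value from connection 1'
    -- Statements here now share connection 1's transaction and locks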
In short, understanding the locking mechanism of SQL Server and mastering the methods of locking the database are essential for a qualified DBA.