Two concurrency models
Pessimistic concurrency
Pessimistic concurrency takes a pessimistic view of the database: it always assumes that conflicts will occur. If the system has enough data-modification activity, any given read may be affected by another user's modification. The pessimistic concurrency model therefore acquires a lock on the data being read, so that other processes cannot modify it, and acquires a lock on the data being modified, so that other processes cannot access it. In a pessimistic concurrency environment, readers block writers, and writers block both readers and other writers.
Optimistic Concurrency
Optimistic concurrency assumes that there are few enough conflicting data-modification operations in the system that any single transaction is unlikely to modify data that another transaction is modifying. A reading process sees the version of the data that existed when its read started, so readers do not block writers and writers do not block readers.
Basic concepts of transactions
A transaction is a logical unit of work that includes a series of operations. Transaction Processing ensures that data-oriented resources are not updated permanently unless all operations in the transaction unit are successfully completed. By combining a set of related operations into a unit that either succeeds or fails, you can simplify error recovery and make the application more reliable.
Transactions have four basic properties, commonly referred to as ACID:
1. Atomicity ("atomic" here means that the operations in a transaction are indivisible)
The operations contained in a transaction are treated as a single logical unit; they either all succeed or all fail.
2. Consistency
Consistency means that only valid data will be written to the database, otherwise it will roll back to the initial state. Transactions ensure that the database status changes from one consistent state to another consistent state.
3. Isolation (also translated as "independence")
When multiple transactions execute concurrently, the execution of one transaction should not affect the execution of the others. The data seen by a transaction is either the state before another transaction modified it or the state after that transaction finished modifying it; a transaction never sees data that is in the middle of being modified. This property is also known as serializability.
4. Durability
The changes made to the database by committed transactions are stored permanently, even if a system failure occurs afterwards. For example, if a sudden power failure occurs before a transaction commits, the entire transaction must be rolled back when the system restarts; if the power failure occurs after the transaction commits, the transaction's results must remain permanently in the database.
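For example, a minimal T-SQL sketch of atomicity and durability (the Account table and its columns are hypothetical, not from the original article):
BEGIN TRY
    BEGIN TRAN
        UPDATE Account SET Balance = Balance - 100 WHERE AccountId = 1
        UPDATE Account SET Balance = Balance + 100 WHERE AccountId = 2
    COMMIT TRAN   -- once committed, the change survives even a restart (durability)
END TRY
BEGIN CATCH
    ROLLBACK TRAN   -- any failure undoes both updates (atomicity)
END CATCH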
Exceptions During Concurrent Transaction Processing
1. Lost update
Transaction 1: update a record.
Transaction 2: update the same record that transaction 1 updated.
Transaction 1: Call commit for commit.
Transaction 2: Call commit for commit.
In this case, the modification made by transaction 1 is completely overwritten by the modification made by transaction 2; this is called a lost update.
When two or more transactions select the same row and then update it based on the value originally selected, updates are lost. Because each transaction is unaware of the others, the last update overwrites the updates made by the other transactions, resulting in lost data.
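A minimal T-SQL sketch of this sequence, assuming the hypothetical Account table introduced above; imagine two connections both running the following read-then-write logic:
DECLARE @bal INT
-- both connections read the current balance first
SELECT @bal = Balance FROM Account WHERE AccountId = 1
-- ... and later write back a value computed from what they read
UPDATE Account SET Balance = @bal - 50 WHERE AccountId = 1
-- whichever connection commits last silently overwrites the other's change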
2. Dirty read: A transaction reads data not committed by another transaction, so you may see data that is finally rolled back by another transaction.
Transaction 1: update a record.
Transaction 2: read the records updated by transaction 1.
Transaction 1: Call commit to commit or call rollback for rollback.
In this case, transaction 2 reads data that exists only in the database's memory and has not yet been committed; this is called a dirty read, and the data read is dirty data.
In other words, a dirty read occurs when one transaction is accessing and modifying data, the modification has not yet been committed to the database, and another transaction reads and uses that data. Because the data has not been committed, what the second transaction reads is dirty data, and operations based on that dirty data may be incorrect.
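A minimal sketch of a dirty read in T-SQL, again assuming the hypothetical Account table; each block runs in its own connection:
-- Connection 1: modify the data but do not commit yet
BEGIN TRAN
UPDATE Account SET Balance = 0 WHERE AccountId = 1
-- Connection 2: read the uncommitted value (a dirty read)
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECT Balance FROM Account WHERE AccountId = 1   -- sees 0
-- Connection 1: roll back, so the value connection 2 saw never officially existed
ROLLBACK TRAN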
3. Non-repeatable read: within the same transaction, the same data is read twice but different content is obtained. That is, when a transaction re-reads data it has read before, it finds that the data has since been modified and committed by another transaction.
Transaction 1: query a record.
Transaction 2: update the records queried by transaction 1.
Transaction 2: Call commit for commit.
Transaction 1: query the last record again.
In this case, transaction 1 queries the same data twice and obtains different content; this is called a non-repeatable read.
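A minimal sketch of a non-repeatable read under read committed, assuming the hypothetical Account table:
-- Connection 1: read the same row twice inside one transaction
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
BEGIN TRAN
SELECT Balance FROM Account WHERE AccountId = 1   -- first read
-- Connection 2 runs here: UPDATE Account SET Balance = 999 WHERE AccountId = 1 (and commits)
SELECT Balance FROM Account WHERE AccountId = 1   -- second read returns a different value
COMMIT TRAN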
4. Phantom read: within the same transaction, the same query is executed twice but a different number of records is returned. That is, when a transaction re-executes a query that returns a set of rows satisfying a search condition, the set now contains additional rows inserted by another recently committed transaction.
Transaction 1: Query all records in the table
Transaction 2: insert one or more records
Transaction 2: Call commit for commit
Transaction 1: Query all records in the table again
At this point, the two queries in transaction 1 return different sets of records; this is called a phantom read.
Explanation: a phantom read is a phenomenon that occurs when transactions are not executed in isolation. For example, the first transaction modifies data in a table, touching all rows of the table; at the same time, a second transaction inserts a new row into the same table. The user running the first transaction then finds that the table still contains unmodified rows, as if seeing a phantom.
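A minimal sketch of a phantom read under repeatable read, assuming the hypothetical Account table:
-- Connection 1: count the rows twice inside one transaction
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN
SELECT COUNT(*) FROM Account                      -- first count
-- Connection 2 runs here: INSERT INTO Account (AccountId, Balance) VALUES (99, 0) (and commits)
SELECT COUNT(*) FROM Account                      -- second count includes the phantom row
COMMIT TRAN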
Transaction isolation
To avoid the unexpected situations described above during concurrent transaction processing, the SQL standard defines four transaction isolation levels. The four levels describe four different degrees of concurrency control, and implementing them requires the use of locks, which are described in detail later.
There are four levels of transaction isolation:
1. Read uncommitted
The lowest level of transaction isolation is read uncommitted, which means the current transaction can read a change made by another concurrent transaction even though that transaction has not yet committed. This is not safe. Read uncommitted allows dirty reads but does not allow lost updates: once a transaction has started writing data, no other transaction may write the same data at the same time, although other transactions may still read the row.
This isolation level is unacceptable for most applications with strict logic; dirty reads create serious risks for the system's concurrency logic.
2. Read committed
Literally, "committed read": a transaction will not read data that another transaction has modified but not yet committed. If transaction A has updated a row but has not yet committed (or rolled back), and transaction B wants to read that row, transaction B must wait until transaction A finishes (commits or rolls back). Read committed prevents dirty reads but does not guarantee repeatable reads.
This isolation level is the most common one, is the default isolation level of most databases, and suits most systems.
3. Repeatable read
The literal translation is "repeatable reading", which means that when the same transaction executes the same query statement successively, the results are the same. If transaction a queries a row, other transactions are not allowed to update the row before transaction a completes. Repeatable read prohibits repeated reads and dirty reads, but sometimes Phantom reads may occur. The transaction that reads data will prohibit other transactions from writing (but allow read transactions), and the transaction that writes data will prohibit any other transactions.
4. Serializable
Translation is "Serial reading". The highest level of transaction isolation also provides the strictest isolation mechanism. The literal translation is "serialization", which means that this transaction cannot be executed concurrently by other transactions. Serializable provides a strict transaction isolation level, which can prevent dirty reads, non-repeated reads, and Phantom reads. It requires that the transaction be serialized and executed only one by one, but not concurrently.
Summary:
Isolation level | Lost update | Dirty read | Non-repeatable read | Phantom read | Description
Read uncommitted | No | Yes | Yes | Yes | Reads data immediately, whether or not the other transaction's update has been committed.
Read committed (default) | No | No | Yes | Yes | Reads only committed data; if another transaction's update is uncommitted, the read waits.
Repeatable read | No | No | No | Yes | Other transactions cannot update the data read by the query until the transaction ends.
Serializable | No | No | No | No | Other transactions cannot insert or delete matching rows until the transaction ends.
Database locks
Transactions and locks are two closely related concepts. Transactions rely on locks to control the concurrency issues described above and thereby prevent the problems that arise during concurrent transaction processing. Any multi-user system needs a locking mechanism.
Resources that can be locked include rows, pages, extents, tables, and databases.
The lock types mainly include shared locks and exclusive locks.
Shared lock: used for shared reads (SELECT).
When a shared lock is placed on an object (which may be a row or a table, depending on the lock granularity), that object cannot be updated (neither by the transaction holding the lock nor by any other transaction). An object can hold multiple shared locks at the same time. In terms of lock compatibility, while an object holds a shared lock, other transactions cannot obtain an exclusive lock on it.
Exclusive lock (X lock):
Only one exclusive lock can be placed on an object. Once an object is exclusively locked by a transaction, other transactions cannot obtain either a shared lock or an exclusive lock on the locked data (that is, exclusive locks are not compatible with shared locks; see lock compatibility for details).
A shared lock means that if transaction A locks an object, transaction B can only read it, not write it. An exclusive lock means that if transaction A locks an object, transaction B can neither read nor write it.
The smaller the locked object (such as locking a row), the better the concurrency, but the higher the management expense, because a lot of locks need to be maintained. Locking large objects (for example, locking the entire database at the beginning of each transaction) will greatly reduce concurrency, but the cost is relatively low, because only a relatively small number of locks need to be maintained.
Note: some documents state that "when one connection is writing, another connection can also write". In fact, this is because the two connections are reading and writing different rows, that is, each connection locks different data.
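A quick way to see this, assuming the table a used in the demo later in this article also has a row with a1 = 12, and that a1 is indexed so each update can seek directly to its row rather than scanning:
-- Connection 1 (leaves its row exclusively locked)
BEGIN TRAN
UPDATE a SET a2 = 20 WHERE a1 = 11
-- Connection 2 succeeds immediately because it locks a different row
UPDATE a SET a2 = 30 WHERE a1 = 12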
SQL Server locks
SQL Server can lock rows, pages, extents, tables, databases, and other resources.
The granularity depends on the data operation. If all rows in the table are updated, table-level locks are used. If a row in the table is updated, row-level locks are used.
Row-level locks are the finest-grained locks and waste the least concurrency, because only the rows actually being used are locked. However, if your transactions frequently operate on many records in a table, row-level locks will be taken on a great many rows and the total number of locks in the system will grow dramatically, increasing overhead and hurting performance. For this reason, SQL Server also supports lock escalation.
Lock escalation means adjusting the lock granularity: many fine-grained locks are replaced by a few coarser-grained locks in order to reduce system overhead. In SQL Server, when a transaction holds many locks and the escalation threshold is reached, the system automatically escalates row-level and page-level locks to a table-level lock. Both the escalation itself and its threshold are determined automatically by the system and require no configuration by the user.
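Since SQL Server 2008 the escalation target can also be influenced per table; a sketch for a hypothetical table dbo.Orders:
-- TABLE (the default): escalate to a table lock; AUTO: escalate to the partition level on
-- partitioned tables; DISABLE: avoid escalation where possible
ALTER TABLE dbo.Orders SET (LOCK_ESCALATION = AUTO)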
Types of SQL Server locks
Shared lock (S): used for operations that do not change or update data (read-only operations), such as SELECT statements. Multiple processes can hold a shared lock on the same data at the same time, but no process can obtain an exclusive lock on data that already holds a shared lock (unless the process requesting the exclusive lock is the one holding the shared lock). A shared lock is normally released as soon as the data has been read, but this behavior can be changed with query hints or by setting a different transaction isolation level.
Update lock (U): an update lock is not an independent kind of lock but a hybrid of a shared lock and an exclusive lock. When SQL Server performs a data modification, it first has to search the table to find the resource to be modified, and that search acquires update locks.
An update lock alone does not give you permission to change the data; every modification requires an exclusive lock on the resource being modified. The update lock acts like a serialization gate that queues later requests for the exclusive lock (many processes can hold a shared lock on a resource, but only one process can hold an update lock on it). While a resource holds an update lock, no other process can obtain an update lock or an exclusive lock on that resource.
Update locks are not used only for UPDATE statements. SQL Server uses them for any data modification that requires a search before the actual change, including qualified updates and deletes as well as inserts into tables with a clustered index. During the search phase SQL Server protects the data with update locks, and only once the correct position has been found and the modification begins is the update lock converted into an exclusive lock.
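The same read-then-modify pattern can be requested explicitly with the UPDLOCK table hint, which is a common way to avoid the conversion deadlock that occurs when two transactions both take a shared lock and then both try to upgrade it to an exclusive lock; a sketch using the hypothetical Account table:
BEGIN TRAN
-- take an update lock while reading, so no other transaction can do the same read-then-write
SELECT Balance FROM Account WITH (UPDLOCK) WHERE AccountId = 1
UPDATE Account SET Balance = Balance - 50 WHERE AccountId = 1   -- the U lock converts to X here
COMMIT TRAN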
Exclusive lock (X): when data is modified by an INSERT, UPDATE, or DELETE operation, SQL Server automatically acquires an exclusive lock on the affected resource. Only one process at a time can hold an exclusive lock on a given resource.
Intent lock: intent locks are not an independent lock mode; they are qualifiers of the modes discussed above. In other words, there are intent shared locks, intent exclusive locks, and intent update locks. Because SQL Server can acquire locks at different levels of granularity, it needs a mechanism to indicate that some component of a resource is already locked. For example, when a process tries to lock a table, SQL Server must be able to determine whether any row or page within that table is already locked.
An intent lock indicates that SQL Server wants to acquire a shared (S) or exclusive (X) lock on some resource lower down in the hierarchy. For example, a shared intent lock placed at the table level means that a transaction intends to place shared (S) locks on pages or rows within that table. Setting an intent lock at the table level prevents another transaction from subsequently acquiring an exclusive (X) lock on the table containing those pages or rows. Intent locks improve performance because SQL Server only needs to examine intent locks at the table level to determine whether a transaction can safely lock the whole table, rather than examining every row or page lock in the table. The intent lock types are intent shared (IS), intent exclusive (IX), and shared with intent exclusive (SIX).
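The intent locks can be observed through the sys.dm_tran_locks view; a sketch using the table a from the demo later in this article:
BEGIN TRAN
UPDATE a SET a2 = 20 WHERE a1 = 11
-- in the same session, list the locks held by this transaction
SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID
-- typical output: IX on OBJECT (the table), IX on PAGE, and X on RID or KEY (the row)
ROLLBACK TRAN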
Relationship between transaction isolation level and lock
What programmers mainly do is set the transaction isolation level; when to acquire a lock, which kind of lock to acquire, and when to release it are handled automatically by the DBMS based on the specified isolation level and the specific SQL statement.
Generally, read statements (select) issue a shared lock, while write statements (Update, insert, delete) issue an exclusive lock.
At the read uncommitted isolation level, read operations acquire no locks and read the data directly, so they neither block nor are blocked by other processes that are reading or writing the data. Read uncommitted isolation simply ignores locks and provides no guarantee of data consistency.
The default isolation level of SQL Server is read committed. SQL Server's read committed isolation has two variants, optimistic and pessimistic, depending on the database option READ_COMMITTED_SNAPSHOT; because that option is off by default, the isolation level uses pessimistic concurrency control by default. Read committed isolation guarantees that the transaction itself never reads data that another transaction has changed but not committed.
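The option is switched per database; a sketch for a hypothetical database named MyDb (the ALTER requires that no other connections be active in the database):
-- switch read committed from the pessimistic (locking) variant to the optimistic (row-versioning) variant
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON
-- check the current setting
SELECT name, is_read_committed_snapshot_on FROM sys.databases WHERE name = 'MyDb'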
Repeatable read guarantees that reading the same data several times within one transaction returns the same results (other transactions cannot update data this transaction has already read while it is running; however, they can still insert new rows, so repeatable read isolation can still produce phantom reads). To provide repeatable read isolation, all shared locks taken in the transaction must be held until the transaction ends.
From the above we can see that how long a shared lock is held depends on the transaction isolation level. If the isolation level is read committed, the shared lock is held only while the read (SELECT) executes and is released as soon as the data has been read. If the isolation level is repeatable read or serializable, the shared lock is held until the transaction ends. Exclusive locks are always held until the end of the transaction so that the transaction can be rolled back if necessary. In addition, if the HOLDLOCK hint is specified in a SELECT statement, the shared lock is also held until the transaction ends.
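For example, a sketch using the table a from the demo below; with HOLDLOCK the SELECT keeps its shared lock (with serializable semantics) until the transaction ends:
BEGIN TRAN
SELECT * FROM a WITH (HOLDLOCK) WHERE a1 = 11   -- shared lock kept until COMMIT/ROLLBACK
-- other transactions can still read this row, but cannot update it until we finish
COMMIT TRAN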
Setting the transaction isolation level in SQL Server
1. The "isolation level" value shown by DBCC USEROPTIONS is the currently effective setting.
The scope of a setting made this way is the current connection, that is, the level you see applies only to the connection you are looking at; but it is normal for SQL Server to have more than 150 connections at the same time, so this approach is not very powerful.
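A quick way to verify the per-connection scope:
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
DBCC USEROPTIONS   -- the "isolation level" row now reports "repeatable read", but only for this connection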
2. You can also set it in ADO.NET.
System.Data.SqlClient.SqlConnection con = new System.Data.SqlClient.SqlConnection(/* connection string */);
con.Open();
System.Data.SqlClient.SqlTransaction tran = con.BeginTransaction(System.Data.IsolationLevel.ReadUncommitted);
3. Set it in SQL statements
Open the query analyzer and open two connections. Enter the following two transactions:
-- Transaction I
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
BEGIN TRAN
UPDATE a SET a2 = 20 WHERE a1 = 11
WAITFOR DELAY '00:00:10'
ROLLBACK TRAN
-- Transaction II
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
SELECT * FROM a WHERE a1 = 11
If you run transaction I first and then run transaction II, transaction II will wait for about 10 seconds (while one connection is modifying the data, other connections cannot query that data until the lock is released; and vice versa: data that is being read cannot be written or modified).
If transaction II is changed to the following:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECT * FROM a WHERE a1 = 11
Then transaction II does not need to wait and executes immediately (which shows that a SELECT in a read uncommitted transaction does not place a shared lock on the data).
Choosing a transaction isolation level does not affect the locks acquired to protect data modifications: a transaction always acquires an exclusive lock on any data it modifies and holds that lock until the transaction completes, whatever the isolation level.
Finally, let's take a practical example: Database concurrency consistency Case Analysis