Concurrent Programming (4): also talk about the database lock mechanism

Source: Internet
Author: User
Tags time 0

First, I declare that this article is based on the replies from other articles or forums. I extract the key points I think for myself to learn. After all references are attached to the document, we will not thank you here.

In the second statement, I have not studied Internel DB and used simple SQL writing. The most recent experience is SQL Performance Tuning, the specific point is to adjust and optimize SQL statements through the execution plan of Postgresql to complete database Optimization in specific scenarios. For locks, the lock mechanism supported by the database can meet the development needs at ordinary times. Because the industry is not engaged in the Internet, there is no real-time high-concurrency Application Scenario, so there is no complicated problem of quickly accessing the database; for the deadlock problem of online applications, I have never studied it. This article is a reading note for you to learn about the database lock mechanism. Thank you again for your sharing.

Why is the lock mechanism very important to the database? Let's take a look at the database concurrency and you will know why:

1. Database concurrency issues

Database concurrency problems include:

1. Update loss.

2. Unconfirmed correlation (dirty read ).

3. Inconsistent analysis (non-repeated read ).

4. Phantom reading.

The detailed description is as follows:

1. Update loss

When two or more transactions select the same row and update the row based on the originally selected value, the update will be lost. Every transaction does not know the existence of other transactions. The last update will overwrite the updates made by other firms, which will lead to data loss.

E.g. Transaction A and transaction B Modify the value of A row at the same time,

When the value of data is 2, the update made by transaction A will be lost.

Take a look at the following SQL statement:

select old_attributes  from table where primary_key = ? ---step1attributes = merge(old_attributes,new_attributes)       ----step2update table set attributes_column = attributes where primary_key = ?   ----step3 

In this case, the two threads ThreadA and ThreadB run to Step 1 to get the same old_attributes,

Then step 2 is performed at the same time, and step 3 is performed at the end of ThreadA, while step 3 is performed at ThreadB, so that ThreadB will update the attributes of ThreadA to be lost!

How can this problem be solved? There are two basic ideas: Pessimistic lock and optimistic lock. Simply put, it is assumed that such a problem is of high probability. It is best to lock it at the beginning, it is not likely that the update will always fail. Another assumption is that such a problem is of a low probability. The last step is to lock the update, so that the lock time is too long to affect other people's operations.

1.1.1 pessimistic lock

A) Traditional pessimistic lock method (not recommended ):

The above example shows that when the pop-up page for modifying the salary is initialized (in this case, it is generally queried from the database), select…… is used in this initialization query ...... For update nowait, you can add a for update nowait statement to lock this record to avoid updates from other users. This ensures that subsequent updates are updated in the correct state. Then, when the link is maintained, the system submits the update. Of course, the premise is to maintain the link, that is, to take a long time for the link, which is obviously unrealistic in the current high concurrency of the web system.

B) The current pessimistic lock method (recommended ):

When you submit a job on the "Modify salary" page, you must also lock the job (select ...... For update nowait), some people will say that it is okay to make a query here to confirm whether the record has changed. Yes, it is necessary to make a confirmation, however, if you do not add for update, you cannot guarantee that this record has not been updated by other sessions during the period from query to update submission. Therefore, you need to lock the record during query, make sure that the record is updated without any change. If there is any change, the system will notify you.

1.1.2. Optimistic lock

A) Old Value Condition (pre-image) method:

The old status value is used as the condition during SQL Update. The SQL statement is roughly as follows: Update table set col1 = newcol1value, col2 = newcol2value .... Where col1 = oldcol1value and col2 = oldcol2value ...., In the above example, we can update the current salary as a condition. If this record has been updated by other sessions, this time 0 rows are updated, here, our application system will usually prompt you to re-query for updates. Which old values are used as the condition for updating depends on the actual situation of the system. (This method may be congested. If the application uses the pessimistic lock method to lock this record for a long time elsewhere, this session will have to wait, therefore, we recommend that you use the Optimistic Locking method in a unified manner .)

B) Use the version column method (recommended ):

In fact, this method is a special mirror method, that is, you do not need to use multiple old values as conditions, you only need to add a version column to the table, this column can be a NUMBER or DATE/TIMESTAMP column, this column is used to record the version of the data. (During table design, we usually add redundant fields of the NUMBER and DATE types to each table for extension, these redundant fields can be used as version columns). In the application, we can maintain the version columns each time. During the update process, the last version is updated as a condition. When updating a row, the condition is: Primary Key + version number, and the version number of the record is updated.

The pseudocode is as follows:

start transaction;select attributes, old_version from table where primary_key = ?attribute Merge operationsupdate table set version = old_verison + 1 , attributes_column = attributes_value where primary_key = ? and version = old_versioncommit; 
After the transaction is committed, check whether the number of record updates in the last update operation is 1. If not, the service will prompt a retry. (This indicates that the update operation has a high concurrency .)

Select the pessimistic lock B method in the application system with a small number of concurrent users and severe conflicts. In other cases, the optimistic lock version column method is used first.

Specify the lock in SQL Server:

SELECT * FROM table WITH (HOLDLOCK) ---- Other transactions can read tables, but cannot update and delete SELECT * FROM table WITH (TABLOCKX) ----- other transactions cannot read, update, and delete tables.
The types of database locks vary. You need to query the api doc.

1. 2. Unconfirmed correlation (dirty read DirtyRead)

Dirty reads are generated when one transaction reads the changes that have not been committed by another transaction. E.g.

1. mary's original salary is 1000, and the financial staff changed Mary's salary to 8000 (but not submitted. mary reads her salary and finds that her salary has changed to 8000!

3. The financial team found that the operation was incorrect and rolled back the transaction. Mary's salary changed to 1000.

In this way, 8000 of Mary's salary is a dirty data.

Solution: if any other transaction cannot read its modified value before the first transaction is committed, this problem can be avoided.

. Inconsistent analysis (non-repeatable read cannot be read repeatedly)

The same query is performed multiple times in the same transaction,Modification or deletion made by other submitting Firms, Each time a different result set is returned, non-repeated read occurs. E.g.

Solution: this problem can be avoided if the data can be read only after the transaction is completely committed.

  1. 4. Phantom readingPhantom read

The same query is performed multiple times in the same transaction,Insert operations submitted by other firms, Each time a different result set is returned, a phantom read occurs. A phantom reading problem occurs when a row is inserted or deleted and the row belongs to the row being read by a transaction. The row range for the first read of the transaction shows that one row no longer exists in the second read or subsequent read because the row has been deleted by other transactions. Similarly, due to the insert operation of other transactions, the second or subsequent read of the transaction shows that a row does not exist in the original read.

E.g. Currently, 10 employees have a salary of 1000.

Solution: if no new data can be added to any other transaction before the operation Transaction Completes data processing, this problem can be avoided.

We will discuss the locking mechanism and do not want to know about the database isolation mechanism.

2. Database isolation mechanism

When talking about the database isolation mechanism, we have to first talk about transaction. Database transactions are strictly defined and must satisfy four features at the same time: Atomic, Consistency, Isolation, and Durabiliy.

Atomicity:Ensure that all or all operations in the transaction are not executed. For example, to execute a transfer transaction, either the transfer is successful or the transfer fails. If the transfer is successful, the amount is transferred from the transfer-out account to the destination account, and the amount of the two accounts changes accordingly. If the transfer fails, the amount of the two accounts remains unchanged. No money is deducted from the transfer-out account, but the target account does not receive the money.
Consistency:Ensure that the database always maintains data consistency-the transaction operation is consistent before and after the transaction operation is consistent, regardless of whether the transaction is successful or not. In the above example, the database maintains data consistency before and after the transfer.
Isolation:If multiple transactions are executed concurrently, the result should be the same as the serial execution of multiple transactions. During Concurrent Data operations, different transactions have their own data space, and their operations will not interfere with the other party. Isolate transactions that allow transaction behavior to be independent or isolated from other concurrent transactions. Through control isolation, each transaction is like the only transaction to modify the database during its operation time. The degree to which a transaction is isolated from other transactions is called the isolation level. The database specifies multiple transaction isolation levels. Different isolation levels correspond to different levels of interference. The higher the isolation level, the better the data consistency, but the weaker the concurrency.
Durability:Durability indicates that after a transaction operation is completed, the impact on the database is persistent. Even if the database is damaged due to a fault, the database should be able to recover. Generally, logs are used.

The ANSI/ISO SQL92 standard defines the isolation level for some database operations. Each isolation level specifies the type of interaction that is not allowed when the current transaction is executed, that is, whether the transactions are mutually isolated, or whether they can read or update the information used by another transaction. Higher isolation levels include restrictions imposed by lower levels.

Four isolation levels defined:

  • Read Uncommited

    Uncommitted records can be read. This isolation level is not used and ignored.

  • Read Committed (RC)

    Snapshot read is ignored.

    For the current read,RC isolation level ensures locking (record lock) of read records).

  • Repeatable Read (RR)

    Snapshot read is ignored.

    For the current read,The RR isolation level ensures that the read records are locked (record lock) and the read range is locked. New records that meet the query conditions cannot be inserted (GAP lock).

  • Serializable

    From MVCC concurrency control to lock-based concurrency control. There is no difference between the snapshot read and the current read. All the read operations are the current read, The read lock (S lock), and the write lock (X lock ).

    At the Serializable isolation level, the read/write conflicts, and the concurrency decreases sharply. Therefore, it is not recommended to use this function.

The database concurrency problems caused by different isolation levels are summarized as follows:

Isolation level
Dirty read
Non-repeated read
Phantom read
Read uncommitted RU
Yes
Yes
Yes
Read submitted RC
No
Yes
Yes
Repeatable read RR
No
No
Yes
Serializing
No
No
No
Therefore, for different isolation levels, you need to actively lock the transaction to avoid these concurrency problems.

3. database lock mechanism

The basic theories of locks used by various large databases are consistent, but their implementations vary.

SQL Server emphasizes that the system manages locks. When a user has an SQL request, the system analysis request automatically adds an appropriate lock to the database between the locking condition and the system performance. At the same time, the system often performs optimization during operation, implement Dynamic locking.

SQLite adopts extensive locks. When a connection needs to write data to the database, all other connections are locked until the write connection ends its transactions. SQLite has a lock table to help different write databases lock at the last moment to ensure maximum concurrency.

MySQL Databases have a variety of data storage engines due to their own architecture characteristics. Each storage engine has different application scenarios. To meet the needs of specific application scenarios, the locking mechanism of each storage engine is optimized for specific scenarios. Therefore, the locking mechanism of each storage engine is quite different.
In general, MySQL storage engines use three types of locking mechanisms: Row-level locking, page-level locking, and table-level locking.

For general users, the system's automatic lock management mechanism can basically meet the usage requirements. However, when writing operations are involved, you must understand the isolation mechanism and the possible problems caused by concurrency, and add the lock mechanism to transactions or SQL statements. For database deadlocks, generally, database systems have a set of mechanisms to unlock, which will not cause database paralysis, but the unlocking process will cause a rapid decline in database performance, reflecting on the program will lead to a decline in the response performance of the program, and may cause some operation failures of the program.

In actual development, we must fully consider all possible concurrency possibilities, neither adding a lock nor ensuring the correctness of data processing. Therefore, it is of great practical significance to deeply understand locks.

3.1 snapshot read VS current read

MVCC (Multi-Version Concurrency Control), the Concurrency Control protocol of multiple versions, has the biggest advantage. I believe it is also familiar: Read is not locked, and read/write does not conflict. In OLTP applications with multiple reads and writes, read/write conflicts are very important, greatly increasing the system's concurrency performance. This is why almost all RDBMS support MVCC at present.

Compared with MVCC, It is Lock-Based Concurrency Control, which is Based on Lock-Based Concurrency Control.

In MVCC concurrency control, read operations can be divided into two types: snapshot read and current read ). Snapshot reading reads the visible version of the record (which may be a previous version) without locking. The current read reads the latest version of the record, and the record returned by the current read will be locked to ensure that other transactions will not modify this record concurrently.

In a system that supports MVCC concurrency control, which read operations are snapshot reads? What operations are currently read? Take MySQL InnoDB as an example:

  • Snapshot read:A simple select operation is a snapshot read without locking. (Of course, there are exceptions, which will be analyzed below)
    • Select * from table where ?;
  • Current read:Special read operations, insert, update, and delete operations, which belong to the current read and need to be locked.
    • Select * from table where? Lock in share mode;
    • Select * from table where? For update;
    • Insert into table values (...);
    • Update table set? Where ?;
    • Delete from table where ?;

    All the preceding statements belong to the latest version of the read record. In addition, after reading, you must ensure that other concurrent transactions cannot modify the current record and lock the read record. In addition to the first statement, the S lock (shared lock) is applied to read records. In other operations, the X lock (exclusive lock) is applied ). Note: The locking of this statement is completed by the database.

3.2 lock of the current read

Why are all insert/update/delete operations classified as current read? Let's take a look at the following update operation and the execution process in the database:


, You can see the specific process of an Update operation. After the Update SQL statement is sent to MySQL, MySQL Server reads the first record that meets the condition according to the where condition. Then, the InnoDB Engine returns the first record and locks the record (current read ). After the MySQL Server receives the lock record, it will initiate an Update request to Update the record. After a record operation is completed, read the next record until no matching record exists. Therefore, the Update operation contains the current read. The same applies to Delete operations. The Insert operation is slightly different. Simply put, the Insert operation may trigger the Unique Key conflict check and perform the current read.

Note: Based on the interaction, InnoDB interacts with MySQL Server one by one for a currently read SQL statement. Therefore, locking is also performed one by one. Lock a record that meets the conditions, return it to the MySQL Server, and perform some DML operations. Then, lock the next record until the read is complete.

A traditional RDBMS Locking Principle is 2PL (Two-Phase lock): Two-Phase Locking. 2PL is relatively easy to understand. The lock operation is divided into two phases: the lock and unlock phases, and the lock and unlock phases are not at each other. Next, we will take MySQL as an example to briefly look at the implementation of 2PL in MySQL.

 

2PL divides lock/unlock into two completely different stages. Lock phase: only lock, no lock. Unlock phase: only lock, no lock.

If you want to learn more about the internal locking mechanism of the database, then please move on to the blog of the big bull: http://hedengcheng.com /? P = 771

 

References:

1. http://blog.csdn.net/mshust2006/article/details/1333361

2. http://www.poluoluo.com/jzxy/201011/98761.html

3. http://bbs.csdn.net/topics/60365964

4. http://blog.csdn.net/d8111/article/details/2595635

5. http://www.cnblogs.com/dongqingswt/archive/2013/03/28/2987367.html

6. http://database.ctocio.com.cn/#/12256926.shtml,

7. http://blog.sina.com.cn/s/blog_5b16e2790100dc75.html

Http://blog.sina.com.cn/s/blog_616b428f010163bo.html

9. http://hedengcheng.com /? P = 771 # _ Toc374698319

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.