Concurrent Programming (IV): The Database Lock Mechanism

Source: Internet
Author: User
Tags: sessions, sqlite

First of all, this article is largely a reorganization of other people's articles and forum replies. I have pulled out what I considered the key points for my own study. All references are listed at the end of the text, so I will not thank each author individually here.

Second, I have not studied database internals; I only write fairly simple SQL. My main experience so far is SQL performance tuning, specifically using PostgreSQL execution plans to adjust and optimize SQL statements for particular scenarios. As for locks, the locking mechanisms that databases support have been sufficient for ordinary development needs. Since I do not work in the Internet industry, there are no real-time, high-concurrency application scenarios, so I have never run into the harder database problems, and I have not studied application-level deadlock issues. This article is a reading note on database locking mechanisms. Thanks again to my colleagues for sharing.

Why is the locking mechanism such an important part of a database? Look at the concurrency problems a database faces and the reason becomes clear.

1. Problems with database concurrency

Concurrency problems associated with the database include:

1. Lost updates.

2. Unconfirmed dependencies (dirty reads).

3. Inconsistent analysis (non-repeatable reads).

4. Phantom reads.

Detailed descriptions follow.

1.1. Lost updates

A lost update occurs when two or more transactions select the same row and then update it based on the value originally selected. Each transaction is unaware of the others, so the last update overwrites the updates made by the other transactions, and data is lost.

For example, transaction A and transaction B modify the value of the same row at the same time: transaction A changes the value to 1 and commits; transaction B changes the value to 2 and commits.

The final value of the data is 2, so the update made by transaction A is lost.

Look at the following section of SQL:

    SELECT old_attributes FROM table WHERE primary_key = ?;           -- step 1
    attributes = merge(old_attributes, new_attributes)                -- step 2, in application code
    UPDATE table SET attributes_column = attributes
     WHERE primary_key = ?;                                           -- step 3
In this case there is a lost update problem: two threads, thread A and thread B, both run step 1 and obtain the same old_attributes. Both then perform step 2 at the same time. Thread A performs step 3 first, and thread B performs step 3 afterwards, so thread B overwrites the attribute update made by thread A, which is lost.
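The race above can be reproduced in a few lines. This is a minimal sketch using an in-memory SQLite table; the table and column names (`t`, `attributes`) and the merge format are illustrative assumptions, and the two "threads" are simulated sequentially to make the interleaving deterministic:

```python
import sqlite3

# Both "sessions" read the same old value (step 1), merge in memory
# (step 2), then write back blindly (step 3) -- the last write wins.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, attributes TEXT)")
conn.execute("INSERT INTO t VALUES (1, 'color=red')")

def read_attributes():
    return conn.execute("SELECT attributes FROM t WHERE id = 1").fetchone()[0]

# Step 1: thread A and thread B both read the same old value.
old_a = read_attributes()
old_b = read_attributes()

# Step 2: each merges its own change into its private copy.
merged_a = old_a + ";size=XL"      # thread A's update
merged_b = old_b + ";price=9.99"   # thread B's update

# Step 3: A writes first, then B overwrites -- A's change is lost.
conn.execute("UPDATE t SET attributes = ? WHERE id = 1", (merged_a,))
conn.execute("UPDATE t SET attributes = ? WHERE id = 1", (merged_b,))

final = read_attributes()
print(final)  # 'color=red;price=9.99' -- 'size=XL' from A is gone
```

The pessimistic and optimistic locking techniques discussed next are two different ways of closing exactly this window between step 1 and step 3.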

How to solve it? There are basically two approaches: pessimistic locking and optimistic locking. Put simply, one assumes the problem occurs with high probability, so it is better to lock at the very beginning lest the update keep failing; the other assumes the problem occurs with low probability, and locks only at the final update step, lest the lock be held so long that it blocks others from doing related operations.

1.1.1 Pessimistic lock

(a) The traditional pessimistic locking method (not recommended):

As in the example above, when the page for editing the salary is opened and initialized (the data is usually queried from the database at that point), lock the record with SELECT ... FOR UPDATE NOWAIT. This blocks updates from other users and ensures that the subsequent update proceeds from the correct state. Then, keeping that connection open, perform the update and commit. The premise, of course, is that the connection is held the whole time, which means occupying it for a long while; for today's high-concurrency, high-frequency web systems this is clearly unrealistic.

(b) The pessimistic locking method used nowadays (recommended):

On the page where the salary revision is submitted, query first, and this query must also take the lock (SELECT ... FOR UPDATE NOWAIT). Some will say: isn't it enough to run a query here to confirm that the record has not changed? Yes, it is a confirmation, but without FOR UPDATE there is no guarantee that the record is not updated by another session between your query and your update commit. So this approach also needs to lock the record at query time, to ensure the update is made on the basis of an unchanged record; if the record has changed, the user is notified.

1.1.2. Optimistic lock

(a) Old value condition (before-image) method:

Use the old state values as conditions in the UPDATE. The SQL looks roughly like: UPDATE table SET col1 = :newcol1value, col2 = :newcol2value ... WHERE col1 = :oldcol1value AND col2 = :oldcol2value .... In the example above, we can use the current salary as the condition: if the record has already been updated by another session, the update affects 0 rows, and the application typically prompts the user to re-query and update again. Which old values to use as conditions depends on the actual system. (This approach can block: if the application elsewhere uses pessimistic locking and holds the record for a long time, this session has to wait. So if you use this approach, it is best to use optimistic locking uniformly throughout.)
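The old-value-condition check can be sketched with SQLite. This is an illustrative example with assumed table and column names (`emp`, `salary`); the key point is that `cursor.rowcount` tells the application whether the conditioned update actually hit the row:

```python
import sqlite3

# The UPDATE only succeeds if the salary still holds the value this
# session originally read -- the old value is the WHERE condition.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, salary INTEGER)")
conn.execute("INSERT INTO emp VALUES (1, 1000)")

# Session 1 reads the current salary.
old_salary = conn.execute("SELECT salary FROM emp WHERE id = 1").fetchone()[0]

# Meanwhile another session changes it and commits.
conn.execute("UPDATE emp SET salary = 2000 WHERE id = 1")

# Session 1 now updates with the old value as a condition.
cur = conn.execute(
    "UPDATE emp SET salary = ? WHERE id = 1 AND salary = ?",
    (old_salary + 500, old_salary),
)
print(cur.rowcount)  # 0 -- the record changed, so the application should re-query
```

A row count of 0 is the signal to notify the user to re-query and retry, exactly as described above.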

(b) Version column method (recommended):

This approach is in fact a specialized before-image method: instead of using several old values as conditions, just add a version column to the table. The column can be a number or a date/timestamp and records the version of the row. (In table design we usually add some redundant number and date fields to every table for future extension; these redundant fields can serve as version columns.) The application maintains the version column on every operation and uses the last-read version as a condition at update time. When updating a row, restrict the condition to primary key + version number, and also update the version number of the record.

Pseudo code is as follows:

    START TRANSACTION;
    SELECT attributes, old_version FROM table WHERE primary_key = ?;
    -- merge the attributes in application code
    UPDATE table SET version = old_version + 1, attributes_column = :attributes_value
     WHERE primary_key = ? AND version = :old_version;
    COMMIT;

After the transaction commits, check whether the affected row count of the last update is 1; if not, prompt the user on the business side to retry (it means the record is being updated concurrently).
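The pseudo code above translates directly into a small runnable sketch. The table layout (`t` with `primary_key`, `version`, `attributes`) and the helper name are illustrative assumptions:

```python
import sqlite3

# Version-column optimistic locking: bump `version` on every update and
# condition the UPDATE on the version that was read.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (primary_key INTEGER PRIMARY KEY,"
             " version INTEGER, attributes TEXT)")
conn.execute("INSERT INTO t VALUES (1, 0, 'a')")

def update_attributes(pk, new_attributes):
    """One optimistic attempt: True on success, False on conflict."""
    old_version = conn.execute(
        "SELECT version FROM t WHERE primary_key = ?", (pk,)).fetchone()[0]
    cur = conn.execute(
        "UPDATE t SET version = ?, attributes = ?"
        " WHERE primary_key = ? AND version = ?",
        (old_version + 1, new_attributes, pk, old_version))
    return cur.rowcount == 1   # 0 rows => someone else updated; retry

ok = update_attributes(1, "b")
print(ok)  # True: no concurrent update happened

# Simulate a conflict: read the version, then let "another session" win.
stale_version = conn.execute(
    "SELECT version FROM t WHERE primary_key = ?", (1,)).fetchone()[0]
conn.execute("UPDATE t SET version = version + 1, attributes = 'c'"
             " WHERE primary_key = 1")
cur = conn.execute(
    "UPDATE t SET version = ?, attributes = 'd'"
    " WHERE primary_key = 1 AND version = ?",
    (stale_version + 1, stale_version))
print(cur.rowcount)  # 0: the version moved on, so the business layer retries
```

The caller loops on `update_attributes` (re-reading between attempts) until it returns True or a retry limit is hit.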

Choose pessimistic lock method (b) in systems with low user concurrency but severe conflicts; in all other cases, prefer the optimistic version-column method.

Locks specified in SQL Server:

    SELECT * FROM table WITH (HOLDLOCK)   -- other transactions can read the table, but cannot update or delete
    SELECT * FROM table WITH (TABLOCKX)   -- other transactions cannot read, update, or delete

Lock types differ between databases; consult the documentation of each.

1.2. Unconfirmed dependencies (dirty read)

A dirty read occurs when one transaction reads a modification that another transaction has not yet committed. For example:

1. Mary's original salary is 1000; a finance clerk changes it to 8000 (but does not commit).

2. Mary reads her salary, finds it has become 8000, and is overjoyed.

3. Finance discovers the mistake and rolls the transaction back; Mary's salary is 1000 again.

The 8000 that Mary remembers is dirty data.

Workaround: the problem is avoided if no other transaction can read the modified value before the first transaction commits.

1.3. Inconsistent analysis (non-repeatable read)

A non-repeatable read occurs when the same query is run more than once within one transaction and returns a different result set each time because of changes or deletions made by other committed transactions. For example: in transaction 1, Mary reads her own salary as 1000, and the transaction is not yet complete; in transaction 2, the finance clerk changes Mary's salary to 2000 and commits. When Mary reads her salary again in transaction 1, it has become 2000.

Workaround: the problem is avoided if data can be read only after the modifying transaction has fully committed.

1.4. Phantom read

A phantom read occurs when the same query is run more than once within one transaction and returns a different result set each time because of inserts made by other committed transactions. It also occurs when an INSERT or DELETE is performed on a row belonging to a range of rows being read by a transaction: the transaction's first read of the range shows a row that no longer exists in the second or subsequent reads, because it was deleted by another transaction; likewise, because of another transaction's insert, the second or subsequent reads show a row that did not exist in the original read.

For example: there are currently 10 employees with a salary of 1000. Transaction 1 reads all employees with a salary of 1000. Transaction 2 then inserts an employee record with a salary of 1000 into the employee table. When transaction 1 reads all employees with a salary of 1000 again, it reads 11 records.

Workaround: the problem is avoided if no other transaction can add new data before the operating transaction has finished processing its data.

One cannot discuss the lock mechanism without first understanding the database isolation mechanism.

2. Database isolation mechanism

When it comes to the isolation mechanism, we must first talk about transactions. Database transactions are strictly defined and must satisfy four properties: atomicity (Atomic), consistency (Consistency), isolation (Isolation), and durability (Durability), abbreviated ACID.

Atomicity: guarantees that all operations in a transaction are performed, or none are. For example, a transfer transaction either succeeds or fails as a whole. On success, the amount is moved from the source account to the destination account and both balances change accordingly; on failure, both balances are unchanged. There will never be a situation where the source account is debited while the destination account receives nothing.
Consistency: ensures that the database always maintains a consistent state: the data is consistent both before and after the transaction, whether the transaction succeeds or fails. In the transfer example, the database keeps the data consistent before and after the transfer.
Isolation: when multiple transactions execute concurrently, the result should be the same as if they had executed serially. Under concurrent data operations, each transaction has its own data space, and their operations do not interfere with one another. Isolation keeps a transaction's behavior independent of, or isolated from, other concurrently running transactions. By controlling isolation, each transaction behaves, during its lifetime, as if it were the only transaction modifying the database. The degree to which a transaction is isolated from others is called the isolation level. Databases provide several transaction isolation levels, each corresponding to a different degree of interference: the higher the isolation level, the better the data consistency, but the weaker the concurrency.
Durability: once a transaction completes, its effect on the database is permanent, and the database should be able to recover that effect even after a failure. The usual implementation is logging.
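The atomicity of the transfer example can be demonstrated with SQLite's transaction support. This is a sketch with assumed table and account names; the simulated failure stands in for any error between the debit and the credit:

```python
import sqlite3

# If the transfer is interrupted after the debit but before the credit,
# rolling back restores both accounts -- no money is lost in between.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("alice", 500), ("bob", 100)])
conn.commit()

try:
    # Debit side of the transfer (starts an implicit transaction).
    conn.execute("UPDATE account SET balance = balance - 200"
                 " WHERE name = 'alice'")
    # Simulated failure before the credit side runs; the credit
    # UPDATE for bob would otherwise follow here.
    raise RuntimeError("transfer interrupted")
except RuntimeError:
    conn.rollback()  # atomicity: the debit is undone as well

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)  # {'alice': 500, 'bob': 100}
```

Without the rollback, alice would have been debited 200 with no corresponding credit, which is exactly the half-done state atomicity rules out.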
The ANSI/ISO SQL-92 standard defines isolation levels for database operations. Each isolation level specifies the kinds of interaction that are not allowed while the current transaction executes, that is, whether transactions are isolated from one another, or can read or update information in use by another transaction. Higher isolation levels include the restrictions of the lower levels.

The four isolation levels defined:

Read Uncommitted (RU)

Uncommitted records can be read. This isolation level is rarely used in practice and is ignored here.
Read committed (RC)

Snapshot reads are ignored and not considered in this article.

For current reads, the RC isolation level guarantees that the records read are locked (record locks); phantom reads can still occur.
Repeatable Read (RR)

Snapshot reads are ignored and not considered in this article.

For current reads, the RR isolation level guarantees that the records read are locked (record locks) and also locks the range being read, so that new records satisfying the query condition cannot be inserted (gap locks).
Serializable

This level falls back from MVCC-based concurrency control to lock-based concurrency control. It does not distinguish snapshot reads from current reads: all reads are current reads; reads take a shared lock (S lock) and writes take an exclusive lock (X lock).

Under the Serializable isolation level, reads and writes conflict, so concurrency drops sharply; it is not recommended.

The problems associated with the database concurrency that will result from different isolation levels are summarized as follows:

    Isolation level          Dirty read   Non-repeatable read   Phantom read
    Read Uncommitted (RU)    Yes          Yes                   Yes
    Read Committed (RC)      No           Yes                   Yes
    Repeatable Read (RR)     No           No                    Yes
    Serializable             No           No                    No
Therefore, under isolation levels that still allow these problems, the transaction must actively take locks to avoid them.

3. Lock mechanism of the database

The basic theory of locking is the same across the major databases, but the concrete implementations differ.

SQL Server leans toward letting the system manage locks. When a user issues a SQL request, the system analyzes it and automatically applies the appropriate locks, balancing locking requirements against system performance; the system also frequently optimizes during execution, applying locks dynamically.

SQLite uses coarse-grained locking. When a connection is writing to the database, all other connections are locked out until the writing connection ends its transaction. SQLite has a set of lock states so that writers take the exclusive lock at the last possible moment, to allow maximum concurrency.
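SQLite's database-level write lock is easy to observe from Python's sqlite3 module. This is a sketch; a shared on-disk file is required (`:memory:` databases are per-connection), and the table name is illustrative:

```python
import os
import sqlite3
import tempfile

# While one connection holds a write transaction, a second writer is
# refused with "database is locked".
path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None puts the driver in autocommit mode so that
# BEGIN/COMMIT can be issued explicitly.
writer = sqlite3.connect(path, isolation_level=None)
writer.execute("CREATE TABLE t (x INTEGER)")
writer.execute("BEGIN IMMEDIATE")        # take the write lock now
writer.execute("INSERT INTO t VALUES (1)")

# timeout=0.1 makes the second connection give up quickly.
other = sqlite3.connect(path, timeout=0.1, isolation_level=None)
locked = False
try:
    other.execute("BEGIN IMMEDIATE")     # tries to take the write lock too
except sqlite3.OperationalError:         # "database is locked"
    locked = True
print(locked)  # True

writer.execute("COMMIT")                 # release the lock ...
other.execute("BEGIN IMMEDIATE")         # ... and the second writer proceeds
other.execute("INSERT INTO t VALUES (2)")
other.execute("COMMIT")
```

The `timeout` parameter is SQLite's busy handler: instead of failing immediately, a blocked connection retries for up to that many seconds before raising the error.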

Because of its architecture, MySQL offers multiple storage engines, each designed for different application characteristics. To meet the needs of its specific scenarios, each storage engine's locking mechanism is optimized for those scenarios, so the locking mechanisms of the engines differ considerably.
Overall, MySQL storage engines use three types (granularities) of locking: row-level locking, page-level locking, and table-level locking.

For ordinary users, the system's automatic lock management is basically sufficient. For write operations, however, it is important to understand the isolation mechanism and the concurrency problems it may still allow, and to add locking to the transaction or the SQL accordingly. As for database deadlocks, database systems generally have a mechanism to break them, so they rarely paralyze the database; but the process of breaking a deadlock causes a sharp drop in performance, which shows up as degraded application performance, and it causes the victim's operation to fail.

In actual development, we should consider all the concurrency that may occur, avoid locks that serve no purpose, and still guarantee the correctness of data processing. A deep understanding of locks therefore has great practical significance.

3.1 Snapshot read vs. current read

The biggest benefit of the Multi-Version Concurrency Control protocol (MVCC) is familiar to most: reads take no locks, and reads and writes do not conflict. In an OLTP application with many reads and few writes, this matters enormously and greatly improves the system's concurrency, which is why almost all RDBMSs support MVCC today.

The counterpart of MVCC is lock-based concurrency control.

Under MVCC, read operations fall into two categories: snapshot reads and current reads. A snapshot read returns a visible version of the record (possibly a historical version) without locking. A current read returns the latest version of the record, and the record it returns is locked to ensure that no other transaction modifies it concurrently.

In a system that supports MVCC, which read operations are snapshot reads and which are current reads? Take MySQL InnoDB as an example.

Snapshot read: a plain SELECT is a snapshot read and takes no locks (with exceptions, analyzed below):

    SELECT * FROM table WHERE ?;

Current read: special read operations and INSERT/UPDATE/DELETE are current reads and must take locks:

    SELECT * FROM table WHERE ? LOCK IN SHARE MODE;
    SELECT * FROM table WHERE ? FOR UPDATE;
    INSERT INTO table VALUES (...);
    UPDATE table SET ? WHERE ?;
    DELETE FROM table WHERE ?;

All of the statements above are current reads: they read the latest version of the record and, after reading, must lock the read records to ensure that no concurrent transaction can modify them. Of these, the first statement places an S lock (shared lock) on the records it reads; all the others place an X lock (exclusive lock). Note: the locking for these statements is done by the database itself.

3.2 Locking for current reads

Why are INSERT/UPDATE/DELETE operations classified as current reads? Consider how an UPDATE executes inside the database.

When the UPDATE SQL reaches MySQL, the MySQL server reads the first record that satisfies the WHERE condition, and the InnoDB engine returns that record and locks it (a current read). After the MySQL server receives the locked record, it issues an update request to modify it. Once that record is done, the next record is read, until no more records satisfy the condition. Thus an UPDATE internally contains a current read. DELETE works the same way. INSERT is slightly different: in short, an INSERT may trigger a unique-key conflict check, which also performs a current read.

Note: for a current-read SQL statement, InnoDB and the MySQL server interact as described above, so locking is done record by record: lock a record that satisfies the condition, return it to the MySQL server, apply the DML, then read and lock the next record, until the read completes. One of the locking principles of a traditional RDBMS is 2PL (Two-Phase Locking). 2PL is relatively easy to understand: it divides a transaction's lock operations into two phases, a locking phase and an unlocking phase, and guarantees that the two phases do not overlap. Again taking MySQL as an example, let us look briefly at 2PL in practice.

2PL splits locking and unlocking into two completely disjoint phases. Locking phase: only acquire locks, never release any. Unlocking phase: only release locks, never acquire any. If you want to learn more about the internal locking mechanism of databases, see He Dengcheng's blog: http://hedengcheng.com/?p=771
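The two phases can be sketched in a few lines. This is an illustrative toy, not how a real lock manager is built: the class, the shared `lock_table` map, and the record ids are all assumptions made up for the example:

```python
import threading

# Minimal 2PL sketch: a transaction acquires locks as it touches records
# (growing phase) and releases them all only at the end (shrinking phase);
# the two phases never interleave.
class TwoPhaseTransaction:
    def __init__(self, lock_table):
        self.lock_table = lock_table   # shared map: record id -> Lock
        self.held = []                 # locks this transaction holds
        self.finished = False

    def read(self, record_id):
        # Growing phase: take the lock before touching the record.
        assert not self.finished, "2PL forbids locking after any unlock"
        lock = self.lock_table.setdefault(record_id, threading.Lock())
        lock.acquire()
        self.held.append(lock)

    def commit(self):
        # Shrinking phase: release everything, then never lock again.
        self.finished = True
        for lock in self.held:
            lock.release()
        self.held.clear()

lock_table = {}
txn = TwoPhaseTransaction(lock_table)
txn.read("row1")
txn.read("row2")           # still growing: more locks may be taken
txn.commit()               # shrinking: all locks released together
print(lock_table["row1"].locked())  # False
```

Releasing every lock only at commit is what makes the schedule serializable; a transaction that released "row1" early and then locked "row2" would violate the no-overlap rule that defines 2PL.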

Resources:

1. http://blog.csdn.net/mshust2006/article/details/1333361

2. http://www.poluoluo.com/jzxy/201011/98761.html

3. http://bbs.csdn.net/topics/60365964

4. http://blog.csdn.net/d8111/article/details/2595635

5. http://www.cnblogs.com/dongqingswt/archive/2013/03/28/2987367.html

6. http://database.ctocio.com.cn/426/12256926.shtml

7. http://blog.sina.com.cn/s/blog_5b16e2790100dc75.html

8. http://blog.sina.com.cn/s/blog_616b428f010163bo.html

9. http://hedengcheng.com/?p=771#_Toc374698319


From: http://blog.csdn.net/anzhsoft/article/details/19132909
