concurrency control and locking analysis of MySQL

Last Update:2018-08-09 Source: Internet

Author: User

Tags mysql version

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

This paper is mainly for mysql/innodb concurrency control and lock technology to do a more in-depth analysis, and related to the important concepts, such as multi-version concurrency control (MVCC), dirty Read (dirty read), Phantom Read (Phantom Read), The four isolation levels (isolation level) are described in detail, and a detailed analysis of MySQL's locking is based on a simple example. The summary of this article refers to the blog of Ho Dengcheng predecessors, and on the basis of the summary of predecessors, carried out a number of basic explanations, I hope that the students have just started a little help, if there are errors, please enlighten me. According to my writing habits, or through a few key questions to organize the logic of the text, as follows:

What is MVCC (multi-version concurrency control)? How to understand snapshot read (snapshot read) and current read?
What is the isolation level? Dirty read? Phantom reading? What is the meaning of the four isolation levels of InnoDB?
What is a deadlock?
How does the InnoDB implement MVCC?
Locking analysis of a simple SQL in different scenarios
Locking analysis of a complex SQL

Next, I will follow these key questions in order to answer the above questions, and in the process of the solution, to the details of the lock technology, elaborated more clearly.

1.1 Mvcc:multi-version Concurrent Control Multi-version concurrency controls

MVCC is a protocol designed to implement concurrency control for a database. From our intuitive understanding, to achieve the concurrency of the database access control, the simplest way is to locking access , that is, read the time can not write (allow multiple west to read simultaneously, that is, shared lock , S lock), Cannot read when writing (only one thread can write to the same piece of data at a time, that is, exclusive lock , X Lock). Such locking access, in fact, is not really a concurrency, or it can only achieve concurrent reading, because it eventually realized the read-write serialization, which greatly reduces the database read and write performance. Locking access is actually a lbcc relative to MVCC, the lock-based concurrency control (lock-based Concurrent control), which is the highest level of serialize isolation in four isolation levels. In order to propose a more superior concurrency performance method than LBCC, MVCC was born.

　　almost all RDBMS support MVCC. The best thing about it is that reading does not lock and read and write does not conflict . In MVCC, the read operation can be divided into two categories, snapshot read (Snapshot read) and current read. Snapshot reads, reads the visible version of the record (possibly a historical version, that the most recent data may be being modified concurrently by the currently executing transaction), does not lock the returned record, and the current read, reads the most recent version of the record, and locks the returned record, ensuring that other transactions do not concurrently modify the record. In MySQL InnoDB, a simple select operation, such as SELECT * from table where? All belong to the snapshot read, and belong to the current read contains the following actions:

SELECT * FROM table where? lock in Share mode; (Plus s lock)
SELECT * FROM table where? for update; (plus x lock, same as below)
Insert, UPDATE, delete operation

For a currently read SQL statement, InnoDB and MySQL server interaction is a one-piece, so locking is also a one-piece. A record that satisfies a condition is locked and returned to MySQL Server to do some DML operations, and then the next lock is read until it is finished. It should be noted that the above need to add X lock is the current read, and the normal select (except for update) are snapshot read, each insert, UPDATE, delete before the current read, will be locked, to prevent other transactions on some rows of data modification, resulting in inconsistencies in the data. In our broad sense, the phenomenon of phantom reading is solved by MVCC, which means that the transaction can return the same data set through the MVCC snapshot reading. As shown in the following:

Note that we generally say that table locks are used in MyISAM because MyISAM locks the entire table when modifying data records, while InnoDB uses row locks, which is the MVCC lock-up problem we talked about above. However, it is not the InnoDB engine that does not use a table lock, for example, when the ALTER TABLE is INNODB, the table is locked with a table lock.

1.2 Isolation LEVEL

In the SQL standard, four isolation levels are defined. Each level specifies the changes made in a transaction, which are visible within and between transactions, and which are invisible. Low-level isolation can perform a higher level of concurrency with good performance, but dirty reads and phantom reads occur. First, we start with two basic concepts:

　　Dirty Read (dirty Read): Two transactions, one transaction read to another transaction uncommitted data, this is dirty read.

　　Phantom Read (Phantom Read): Two transactions, transaction A and transaction B, transaction A in its own execution of the process, the execution of the same query two times, the first query transaction B is not committed, the second query transaction B has been committed, resulting in two query results are not the same, this is actually called Non-repeatable read ; if transaction B is an insert operation that affects the results of the query, it is as if the new line is more like an illusion, so it is called a phantom read. The submission of other transactions affects duplicate query results in the same transaction.

Here is a brief description of the four standard isolation levels defined in SQL:

Read uncommitted (UNCOMMITTED) : Isolation level: 0. The uncommitted records can be read. Dirty reads will occur.
Read COMMITTED (committed) : Isolation level: 1. Only the committed changes are visible in the transaction. Do not repeat read, there will be phantom read. (in InnoDB, a row is added, but no gap lock is added) The isolation level is the default isolation level for most database systems, but MySQL is RR.
REPEATABLE READ (repeatable Read) : Isolation level: 2. This is true in InnoDB : The RR Isolation level guarantees locking (record locks) on the read record, while guaranteeing that the read range is locked, the new record that satisfies the query condition cannot be inserted (GAP Lock), so there is no phantom read. However, a standard RR can only guarantee that the results of reading the same record multiple times in the same transaction are consistent and cannot solve the phantom read problem. InnoDB's Magic reading solution is based on the MVCC implementation mechanism.
SERIALIZABLE (SERIALIZABLE): Isolation Level: 3. This isolation level adds a lock to each row of data read, and is degraded to lock-based concurrency control, LBCC.

It is important to note that MVCC only works under RC and RR two isolation levels, and the other two isolation levels are incompatible with MVCC.

1.3 Dead Lock

A deadlock is a phenomenon in which two or more transactions interact on the same resource and request a lock on the resources occupied by each other, leading to a vicious cycle. Deadlocks can occur when multiple transactions attempt to lock resources in different order . Deadlocks can also occur when multiple transactions lock the same resource at the same time. And look at the following two examples of the resulting deadlock:

The first deadlock is well understood, and the second deadlock, due to the fact that the main index (clustered index table) is still a different order of two records in the lock, it will still cause a deadlock. Whether or not a deadlock occurs is not the number of SQL statements in a transaction, the key to a deadlock is that the order of two (or more) session locks is inconsistent . Therefore, by analyzing the lock details, we can determine if the SQL we write will deadlock, and what we should do when the deadlock occurs.

MVCC realization mechanism of 1.4 InnoDB

MVCC can be thought of as a variant of a row-level lock, which in many cases avoids lock-up operations and therefore costs less. Most implementations of MVCC implement non-blocking read operations, and write operations lock only the necessary rows. The MVCC implementation of INNODB is achieved by saving the snapshot of the data at a certain point in time. a transaction, regardless of how long it executes, the data it sees inside is consistent . That is, transactions do not affect each other during execution. Let's briefly describe the implementation of MVCC in InnoDB.

InnoDB's MVCC, implemented by saving two hidden columns after each row of records: A saved row's creation time, a save line's expiration time (delete time), of course, the time here is not a timestamp, but the system version number, each start a new transaction, the system version number will be incremented . Under the RR isolation level, the MVCC operation is as follows:

Select operation. A. InnoDB only looks for data rows that are earlier than (contain equals) The current transactional version . You can ensure that a transaction reads a row, either before the transaction starts, or as a record of the transaction itself being inserted or modified. B. the deleted version of the row is either undefined or larger than the current transaction version number . You can ensure that a transaction reads rows that are not deleted before the transaction begins.
Insert operation. Saves the newly inserted row with the current version number as the line version number.
Delete operation. The deleted row is saved with the current version number as the delete identity.
Update operation. becomes a combination of insert and delete operations, the Insert row holds the current version number as the line version number, and delete saves the current version number to the original line as the delete identity.

Because the old data is not really deleted, so the data must be cleaned up, InnoDB will open a background thread to perform cleanup work, the rule is to delete the deletion version number is less than the current system version of the row delete , this process is called purge.

1.5 Locking analysis of a simple SQL

In MySQL's InnoDB, it is based on the clustered index table. And the normal select operation is based on the snapshot read, does not need to lock. So how do we analyze the lock details when we analyze other SQL statements? Let's take a simple delete operation of SQL as an example to make a detailed elaboration. And look at the following sql:

Delete from T1 where id=10;

If this SQL is locked analysis, then how is MySQL locked? In general, our intuitive feeling is that we will lock the id=10 record . However, this rash conclusion is one-sided, we need to know more conditions to determine MySQL lock-up situation. What other conditions do you need to know? Like what:

Is the ID column a primary key?
What is the isolation level of the system?
If the ID is not a primary key, does it have an index on it?
Is the index created a unique index?
What is the execution plan for this SQL? Index scan? Full table scan?

Next, I'll combine the answers to these questions and then analyze each combination in an easy-to-difficult order, with the corresponding SQL lock.

Combination 1: ID column is primary key, RC isolation Level
Combination 2: ID column is a level two unique index, RC isolation Level
Combination 3: ID column is a level two non-unique index, RC isolation Level
Combination 4: No index on the ID column, RC isolation level
Combination 5: ID column is primary key, RR isolation level
Combination 6: ID column is a level two unique index, RR isolation level
Combination 7: ID column is a level two non-unique index, RR isolation level
Combination 8: No index on ID column, RR isolation level
Combination 9: Serializable isolation Level

Combined 1:id column is primary key, RC isolation Level

When the ID is the primary key, we only need to add an X lock on the id=10 record. As shown in the following:

The combined 2:id column is a level two unique index, the RC isolation level

Here I'll explain the difference between a clustered index and a normal index. In InnoDB, the primary key can be understood as the clustered index, the leaf node in the clustered index is the corresponding data row, the table with the clustered index is also called the Clustered Index table, and the data is stored in the order of the primary key when it is stored. We all know that when the database is in Select, the index column is selected for the lookup, the index columns are stored according to the B + tree (multi-fork search tree) data structure, find the primary key, and then return to the clustered Index table for query, which is called back Table query. Then we will naturally ask, when using the index to query, the index corresponding to the record will be locked? That's going to happen . If the ID is a unique index, the x lock is only recorded for the index corresponding to the unique index, and if the ID is a non-unique index, all the corresponding index records are on the X lock. As shown in the following:

The combined 3:id column is a level two non-unique index, the RC isolation level

Explain ibid., e.g.:

There is no index on the combined 4:id column, rc isolation level

Because there is no index on the ID column, you can only go through the clustered index and scan it all. Some say the x lock will be added to the table; some say it will be on the clustered index, the selected ID = 10 record plus the X lock. The real situation is as follows:

If there is no index on the ID column, SQL will filter through the full scan of the clustered index, as filtering is done at the MySQL server level. Therefore, each record, whether or not it satisfies the condition, is added with an X lock. However, for the sake of efficiency, MySQL is optimized, for the record that does not satisfy the condition, will put the lock after judgment, finally hold the lock on the record that satisfies the condition, but the locking/release lock action on the record that does not satisfy the condition will not be omitted. At the same time, optimization also violates the 2PL constraint (while locking at the same time lock).

Combination of 5,6 and above (because only one result record, only on the above lock)

The combined 7:id column is a level two non-unique index, RR isolation level

At the RR isolation level, gap locks are used to prevent phantom reads from occurring. Here, you can interpret the gap lock as not to allow data to be inserted in front of the data record. First, through the ID index to locate the first to satisfy the query criteria record, add record x lock, add gap on Gap Lock, and then add the primary key clustered index record x lock, then return; then read the next one, repeat. Until the first record that does not meet the criteria [11,f] is reached, the X lock is not required, but the gap lock is still required, and the end is returned. As shown in the following:

Combined 8:id Column No index, RR isolation level

In this case, all records on the clustered index are added with an X lock. Second, the gap (GAP) between each record of the cluster index is also added to the gap lock. Such as:

However, MySQL is doing a related optimization, which is called Semi-consistent read. Semi-consistent read is on, MySQL will lock the record in advance and will not add a gap lock for records that do not meet the query criteria.

Combining 9:serializable Isolation Levels

Same as the RR isolation level.

1.6 Locking analysis of a complex SQL

Here we are just a conclusion, because to relate to the MySQL where query condition analysis, so here do not do a detailed introduction, I will be detailed in the following blog. Such as:

Conclusion: At the RR isolation level, for a complex SQL, you first need to extract its where condition. Index key to determine the scope, need to add gap lock ; Index filter filter, depending on whether the MySQL version supports ICP, if the ICP is supported, the Index filter does not meet the record, no X lock, otherwise need x lock; Table Filter filters, whether or not satisfied, require an X lock. The result of the lock is as follows:

concurrency control and locking analysis of MySQL

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More