The relationship between the transaction isolation level and the lock in Innodb

Source: Internet
Author: User
Tags mutex mysql in

Transferred from: https://tech.meituan.com/innodb-lock.html

Objective:

We all know the nature of transactions, the database in order to maintain these properties, especially consistency and isolation, the general use of lock this way. At the same time, the database is a high concurrency application, there will be a lot of concurrent access, if the lock over, will greatly reduce the concurrency processing power. So for the lock processing, it can be said that the database is the essence of transaction processing. Here, by analyzing the locking mechanism of the InnoDB engine in MySQL, we can give the reader a better understanding of what the database does in the transaction processing.

#一次封锁or两段锁?
Because there is a lot of concurrent access, in order to prevent deadlocks, the general application is recommended to use a blocking method, that is, at the beginning of the method, you already know what data will be used, and then all locked, after the method runs, and then all unlocked. This approach effectively avoids cyclic deadlocks, but is not applicable in the database because the database does not know what data will be used at the beginning of the transaction.
The database follows the two-phase lock protocol, dividing the transaction into two stages, locking the stage and unlocking the stage (so called two-segment lock)

    • Lock phase: The lock operation can be performed at this stage. Before reading any data, request and obtain S lock (shared lock, other transactions can continue to share the lock, but not the lock), before the write operation to request and obtain an X lock (exclusive lock, other transactions can no longer acquire any locks). The lock does not succeed, the transaction goes into a wait state until locking succeeds to continue execution.
    • Unlocking phase: When a transaction releases a block, the transaction enters the unlock phase, and the lock operation is no longer allowed at this stage.
Transactions locking/unlock handling
Begin
INSERT into Test ..... Add insert corresponding to the lock
Update Test Set ... Add update corresponding to the lock
Delete from Test .... Add delete corresponding to the lock
Commit When a transaction commits, the lock corresponding to the INSERT, update, and delete are released simultaneously

While this approach does not prevent deadlocks, the two-phase lock protocol ensures that concurrent scheduling of transactions is serializable (serialization is important, especially in data recovery and backup).

#事务中的加锁方式

# #事务的四种隔离级别
In the database operation, in order to effectively guarantee the correctness of the concurrent read data, the transaction isolation level is proposed. Our database locks are also available to build these isolation levels.

Isolation Level Dirty Reads (Dirty read) non-repeatable read (nonrepeatable Read) Phantom Read (Phantom Read)
Uncommitted read (READ UNCOMMITTED) possible possible possible
Read Committed No way possible possible
REPEATABLE READ (Repeatable Read) No way No way possible
Serializable (Serializable) No way No way No way
    • Uncommitted read (READ UNCOMMITTED): Allows dirty reads, which may read data that has not been committed in other sessions for transaction modification
    • Read Committed: Read only the data that has been committed. Most databases, such as Oracle, are at this level by default (not repeating)
    • Repeatable read (repeated read): Repeatable read. Queries within the same transaction are all consistent at the beginning of the transaction, InnoDB the default level. In the SQL standard, this isolation level eliminates non-repeatable reads, but there are also phantom reads
    • Serial Read (Serializable): Fully serialized read, each read requires a table-level shared lock, read and write each other will block

Read uncommitted This level, the database is generally not used, and any operation is not locked, this is not discussed here.

# #MySQL中锁的种类
There are many kinds of locks in MySQL, there are common table locks and row locks, there are new metadata lock and so on, the table lock is a full table lock, although can be divided into read lock and write lock, but after all, is locked the entire table, will cause the concurrency ability to decline, usually do DDL processing use.

Row locks are locked data rows, this locking method is more complex, but because only the limited data lock, for other data unrestricted, so the concurrency is strong, MySQL is generally used row lock to handle concurrent transactions. The main discussion here is the row lock.

# # #Read Committed (read submissions)
At the RC level, data reads are unlocked, but data writes, modifies, and deletes are required to be locked. The effect is as follows

MySQL> show create table class_teacher \GTable: class_teacherCreate Table: CREATE TABLE `class_teacher` (  `id` int(11) NOT NULL AUTO_INCREMENT,  `class_name` varchar(100) COLLATE utf8mb4_unicode_ci NOT NULL,  `teacher_id` int(11) NOT NULL,  PRIMARY KEY (`id`),  KEY `idx_teacher_id` (`teacher_id`)) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci1 row in set (0.02 sec)MySQL> select * from class_teacher;+----+--------------+------------+| id | class_name   | teacher_id |+----+--------------+------------+|  1 | 初三一班     |          1 ||  3 | 初二一班     |          2 ||  4 | 初二二班     |          2 |+----+--------------+------------+

Since MySQL's innodb default is the RR level used, we'll first turn the session into an RC level and set the Binlog mode

SET session transaction isolation level read committed;SET SESSION binlog_format = ‘ROW‘;(或者是MIXED)
Transaction A Transaction B
Begin Begin
Update Class_teacher set Class_name= ' third day second class ' where teacher_id=1; Update Class_teacher set Class_name= ' At the beginning of Class 33 ' where teacher_id=1;
ERROR 1205 (HY000): Lock wait timeout exceeded; Try restarting transaction
Commit

In order to prevent modification conflicts in the concurrency process, MySQL in transaction a locks the data row of teacher_id=1 and does not commit (releasing the lock), then transaction B will never get the row lock, wait until timeout.

At this point we have to notice that teacher_id is indexed, and if it is a class_name without an index? Update Class_teacher set teacher_id=3 where class_name = ' third day class ';
Then MySQL will add a row lock to all rows of data in the entire table. It sounds a bit strange here, but when SQL is running, MySQL doesn't know which rows are class_name = ' third day ' (no index), and if a condition can't be quickly filtered through the index, the storage engine plane will lock all the records back, and then the MySQL Server layer for filtering.

However, in the actual use of the process, MySQL made some improvements, in the MySQL server filter conditions, found not satisfied, will call the Unlock_row method, the non-satisfied record release lock (violates the two-segment lock protocol constraints). This ensures that only the locks that satisfy the condition record will be held at the end, but the lock operation of each record cannot be omitted. It can be seen that even MySQL, in order to be efficient, will violate the specification. (See high-performance MySQL, third edition of p181)

This also applies to MySQL's default isolation level RR. So for a large number of tables to do batch modification, if the corresponding index can not be used, MySQL Server filter data when the time is particularly slow, it will appear that although some rows of data is not modified, but they are still locked in the phenomenon.

# # #Repeatable Read (can be reread)
This is the default isolation level for InnoDB in MySQL. Let's take a "read" and "write" two modules to explain.

# # #读
Reading is rereading, and the concept of being stressed is that multiple instances of a transaction will see the same data row when reading the data concurrently, a bit abstract, let's look at the effect.

Display in RC (non-stressed) mode

Transaction A Transaction B
Begin

Begin

Select id,class_name,teacher_id from Class_teacher where teacher_id=1;

ID class_name teacher_id
1 Third Day Second class 1
2 Third Day Class One 1

Update Class_teacher set Class_name= ' At the beginning of Class 33 ' where id=1;

Commit

Select id,class_name,teacher_id from Class_teacher where teacher_id=1;

ID class_name teacher_id
1 Third Day Class three 1
2 Third Day Class One 1

Read the data modified by transaction B, and the result of the first query is not the same, it is not reread.

Commit


After transaction B modifies the data submitted by id=1, transaction A has the same query, and the result is different from the previous one, which is not reread (the result of re-reading is not the same). This is likely to bring some problems, so let's look at the performance of MySQL in the RR level:

Transaction A Transaction B Transaction C
Begin

Begin

Begin

Select id,class_name,teacher_id from Class_teacher where teacher_id=1;

ID class_name teacher_id
1 Third Day Second class 1
2 Third Day Class One 1

Update Class_teacher set Class_name= ' At the beginning of Class 33 ' where id=1;

Commit

INSERT into class_teacher values (null, ' first 33 shifts ', 1);

Commit

Select id,class_name,teacher_id from Class_teacher where teacher_id=1;

ID class_name teacher_id
1 Third Day Second class 1
2 Third Day Class One 1

The data modified by transaction B is not read, and is repeatable read as the first SQL read.

No new data added to transaction C was read.

Commit

We note that when teacher_id=1, transaction a reads first, the id=1 data is modified in the middle of transaction B, and after commit, the second time that transaction a reads the data is exactly the same as the first. So it is re-emphasized. So how did MySQL do it? Let's sell a xiaoguanzi here, and we'll look down.

# # # #不可重复读和幻读的区别 # #
A lot of people are prone to be confused and unreadable, and indeed they are somewhat similar. But non-repeatable reading focuses on update and delete, and the focus of Phantom reading is on insert.

If the lock mechanism is used to implement these two isolation levels, in repeatable reads, the SQL is locked for the first time after it is read to the data, and other transactions cannot modify the data so that repeatable reads can be achieved. However, this method cannot lock the insert data, so when transaction a previously read the data, or modify all the data, transaction B can still insert the data to commit, then transaction a will find that there is no more than one of the previous data, this is the Phantom read, can not be avoided by row lock. Need to serializable isolation level, read lock, write with write lock, read lock and write lock mutex, this can effectively avoid phantom reading, non-repeatable read, dirty reading problems, but will greatly reduce the concurrency of the database.

Therefore, the most important difference between non-repeatable reading and phantom reading is how to solve their problems by locking mechanism.

As mentioned above, the use of pessimistic locking mechanism to deal with these two problems, but MySQL, ORACLE, PostgreSQL and other mature databases, for performance reasons, are using optimistic locking theory based on the MVCC (multi-version concurrency control) to avoid these two problems.

# # # #悲观锁和乐观锁 # #

    • Pessimistic lock

As its name implies, it refers to the conservative attitude of data being modified by the outside world (including other transactions currently in the system, as well as transactions from external systems), so that data is locked during the entire data processing process. Pessimistic lock implementation, often rely on the database provided by the lock mechanism (also only the database layer provides a lock mechanism to truly guarantee the exclusivity of data access, otherwise, even in this system to implement the locking mechanism, there is no guarantee that the external system will not modify the data).

In the case of pessimistic locking, in order to ensure the isolation of the transaction, the consistency lock read is required. Lock when reading data, other transactions cannot modify this data. Locks are also added when modifying deleted data, and other transactions cannot read the data.

    • Optimistic lock

Relative pessimistic lock, the optimistic locking mechanism adopts a more relaxed locking mechanism. Pessimistic locking relies on the lock mechanism of the database in most cases, to ensure the maximum degree of exclusivity of the operation. But it comes with a lot of overhead for database performance, especially for long transactions, which are often unsustainable.

And the optimistic locking mechanism solves this problem to some extent. Optimistic locking, mostly based on the data version (versions) recording mechanism implementation. What is a data version? is to add a version identity to the data, which is typically done by adding a "version" field to the database table in the version solution based on the database table. When the data is read, the version number is read together, and then the version number is added one after the update. At this point, the version data of the submitted data is compared to the current version information of the database table corresponding to the record, and if the submitted version number is greater than the current version number of the database table, it is updated, otherwise it is considered to be outdated data.

To illustrate, MVCC implementations do not have a fixed specification, each database will have a different way of implementation, here is the discussion of InnoDB MVCC.

# # #MVCC在MySQL的InnoDB中的实现
In InnoDB, two additional hidden values are added after each row of data to implement MVCC, which is a record of when this row of data was created and when another record of when this row of data expires (or is deleted). In practice, it is not the time that is stored, but the version number of the transaction, and the version number of the transaction is incremented for each new transaction that is opened. Under the re-repeatable reads transaction ISOLATION level:

    • Select, read create version number <= current transaction version number, delete version number is empty or > Current transaction version number.
    • When you insert, save the current transaction version number to the created version number of the row
    • Delete, save the deleted version number of the row for the current transaction version number
    • UPDATE, insert a new record, save the current transaction version number for the line to create the version number, while saving the current transaction version number to the original deleted row

Through MVCC, although each row of records requires additional storage space, more row checking work and some additional maintenance work, but can reduce the use of locks, most of the read operation is not locked, read data operation is very simple, good performance, and also can be guaranteed to read only the standard line, and only lock the necessary rows.

Whether we learn from a database textbook or see it on the web, most of the four isolation levels in the transaction above mean that the RR level is repeatable, but cannot solve the phantom read, but only at the serializable level. So I added a transaction C to show the effect. When you add a teacher_id=1 data in transaction C, the COMMIT,RR level should have a phantom read, and transaction a reads the newly added data of transaction C when querying the teacher_id=1 data. However, the test found that there is no such situation in MySQL, after transaction C is committed, transaction A will not read this data. Visible in the MySQL RR level, it solves the read problem of Phantom read. See

Read the problem solved, according to the definition of MVCC, the concurrent submission of data conflict, then how to resolve the conflict? Let's look at the RR level in InnoDB for the processing of the write data.

# # # The difference between "read" and "read"
Some readers may wonder whether the isolation level of a transaction is actually defined as reading data, but here it is broken down to read and write two modules to explain. This is mainly because the read in MySQL, and the read in the transaction isolation level, are not the same.

We also see that, at the RR level, through the MVCC mechanism, although the data becomes repeatable read, the data we read may be historical data, is not timely data, not the database current data! This is likely to be problematic in some businesses that are particularly sensitive to the timeliness of data.

For this way of reading historical data, we call it snapshot read (snapshot read), while reading the database current version of the data, called the current read. Obviously, in the MVCC:

    • Snapshot read: Is Select
      • SELECT * from table ....;
    • Current read: Special read operation, insert/update/delete operation, belongs to the current read, processing is the current data, need to lock.
      • SELECT * FROM table where? lock in Share mode;
      • SELECT * FROM table where? for update;
      • Insert
      • Update;
      • Delete

The isolation level of a transaction actually defines the current level of reading, and MySQL improves concurrency by reducing the time it takes to lock processing (including waiting for other locks), and introduces the concept of snapshot reading, so that select does not have to be locked. and update, insert these "current read", need additional module to solve.

# # #写 ("current read")
While the isolation level of a transaction defines only the requirements for reading data, it is actually a requirement to write data. The "read" above is actually a snapshot read, and the word "write" here is the current reading.
In order to solve the Phantom read problem in the current read, MySQL transaction uses the Next-key lock.

# # # #Next-key Lock
The Next-key lock is a combination of a row lock and Gap (Gap Lock), the row lock has been introduced above, next the gap gap lock.

Row locks prevent data conflicts when data is submitted for different transactional versions. But how to avoid inserting data into other transactions is a problem. We can look at the comparison of RR level and RC level

RC level:

Transaction A Transaction B
Begin

Begin

Select id,class_name,teacher_id from Class_teacher where teacher_id=30;

ID class_name teacher_id
2 Third Day Second class 30

Update Class_teacher set Class_name= ' third day four classes ' where teacher_id=30;

INSERT into class_teacher values (null, ' third day second class ', 30);

Commit

Select id,class_name,teacher_id from Class_teacher where teacher_id=30;

ID class_name teacher_id
2 Third Day Class Four 30
10 Third Day Second class 30

RR level:

Transaction A Transaction B
Begin

Begin

Select id,class_name,teacher_id from Class_teacher where teacher_id=30;

ID class_name teacher_id
2 Third Day Second class 30
Update Class_teacher set Class_name= ' third day four classes ' where teacher_id=30;

INSERT into class_teacher values (null, ' third day second class ', 30);

Waiting ....

Select id,class_name,teacher_id from Class_teacher where teacher_id=30;

ID class_name teacher_id
2 Third Day Class Four 30
Commit After the transaction acommit, insert for transaction B executes.

By contrast we can find that in the RC level, transaction a modifies all teacher_id=30 data, but when the transaction binsert new data, transaction a discovers that there is more than one row of teacher_id=30 data and is not modified by the previous UPDATE statement , this is the "current read" of the Phantom read.

At the RR level, transaction a locks after update and transaction B cannot insert new data so that transaction A is consistent with the data read before and after the update, avoiding Phantom reads. This lock is the gap lock.

This is how MySQL is implemented:

In the Class_teacher table, teacher_id is an index, so it maintains a set of data relationships for a B + tree, and for simplicity we use the list structure to express (actually a tree structure, but the same principle)

, InnoDB uses a clustered index, and teacher_id as a level two index, maintaining a tree structure of index fields and primary key IDs (which are represented here in the form of lists) and keeping them in order.

InnoDB the data into a few intervals.

    • (Negative infinity, 5],
    • (5,30],
    • (30,positive infinity);

Update Class_teacher set Class_name= ' Third day class ' where teacher_id=30, not only with row locks, the corresponding data row is locked, but also on both sides of the interval, (5,30] and (30,positive Infinity), have been added to the gap lock. In this way, transaction B cannot insert new data into this two interval.

Limited by this implementation, InnoDB often locks the interval that does not require a lock. As shown below:

Transaction A Transaction B Transaction C
Begin Begin Begin

Select id,class_name,teacher_id from Class_teacher;

ID class_name teacher_id
1 Third Day Class One

5

2 Third Day Second class 30
Update Class_teacher set Class_name= ' The first class ' Where teacher_id=20;

INSERT into class_teacher values (null, ' first 35 shifts ', 10);

Waiting .....

INSERT into class_teacher values (null, ' first 35 shifts ', 40);
Commit After transaction a commits, the statement is inserted successfully Commit
Commit

Update Teacher_id=20 is in the (5,30] interval, even if no data is modified, InnoDB will also add gap lock in this interval, and other intervals will not affect, transaction c normal insertion.

If you are using a field that does not have an index, such as update class_teacher set teacher_id=7 where Class_name= ' first Class 38 (even if no data is matched), then a gap lock is added to the whole table. At the same time, it cannot pass the MySQL server filter to automatically unlock the lock that does not satisfy the condition, as the row lock above, because there is no index, these fields are not sorted, there is no interval. Other transactions cannot insert any data unless the transaction commits.

Row locks prevent other transactions from being modified or deleted, gap locks prevent other transactions from being added, and Next-key locks formed by the combination of row and gap locks together solve the problem of the RR level's phantom reading when writing data.

# # #Serializable
This level is simple, read plus shared lock, write exclusive lock, read and write mutex. The pessimistic locking theory used is simple, the data is more secure, but the concurrency ability is very poor. You can use this mode if your business is exceptionally small or not concurrent and requires timely and reliable data.

Here to vomit trough a sentence, do not see Select to say will not add lock, in serializable this level, still will add lock!

Resources
      • MySQL reference manual
      • The third edition of high-performance MySQL

The relationship between the transaction isolation level and the lock in Innodb

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.