Analysis of Gap Locks and the Deadlock Problem

One of our production services occasionally reports the following exception (about 20 times a day): "Deadlock found when trying to get lock;".

Oh no, a deadlock. The errors are infrequent and there is no visible performance impact, but the problem still needs to be solved, because nobody can guarantee it will not turn into a performance bottleneck some day.
To analyze the problem systematically, this article covers five topics: how deadlocks are detected, the relationship between indexes, isolation levels and locks, the causes of deadlocks, how to avoid deadlocks, and how to locate the cause of a deadlock.

Figure 1 Application log

1 How the deadlock was found

1.1 Deadlock causes and detection methods

Do the two cars in the left picture form a deadlock? No. Do the four cars in the right picture form a deadlock? Yes.
Figure 2 Deadlock illustration

The storage engine we use with MySQL is InnoDB. The log shows that InnoDB actively detected the deadlock and rolled back one of the waiting transactions. The question is: how does InnoDB detect a deadlock?

The intuitive approach is this: when two transactions are waiting for each other and one of them has waited longer than a configured threshold, roll that transaction back so the other one can continue. This method is simple and effective. In InnoDB, the parameter innodb_lock_wait_timeout sets the timeout.

Relying on timeouts alone to detect deadlocks is too passive. InnoDB also provides a wait-for graph algorithm for active deadlock detection: whenever a lock request cannot be granted immediately and has to wait, the wait-for graph algorithm is triggered.

1.2 Wait-for graph principle

How do we know that the four cars in the picture above are deadlocked? They wait for each other's resources and the waits form a loop. Treat each car as a node. When node 1 has to wait for node 2, draw a directed edge from node 1 to node 2. The result is a directed graph, and we only need to check whether it contains a cycle: a cycle means a deadlock. This is the wait-for graph algorithm.
Figure 3 Wait for graph
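
The cycle check described above can be sketched in a few lines. This is a generic depth-first search over a directed graph, not InnoDB's actual implementation. The node names and the two sample graphs are made up for illustration, and the two-car graph assumes one car simply waits for the other.

```python
from typing import Dict, List, Optional

WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored


def find_cycle(wait_for: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle of the wait-for graph as a list of nodes, or None."""
    color: Dict[str, int] = {}
    for node, targets in wait_for.items():
        color[node] = WHITE
        for target in targets:
            color[target] = WHITE
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        path.append(node)
        for nxt in wait_for.get(node, []):
            if color[nxt] == GREY:
                # nxt is already on the current path: the waits form a loop.
                return path[path.index(nxt):]
            if color[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle is not None:
                    return cycle
        color[node] = BLACK
        path.pop()
        return None

    for node in list(color):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle is not None:
                return cycle
    return None


# Hypothetical graphs: an edge "a -> b" means "a waits for b".
two_cars = {"car1": ["car2"]}
four_cars = {"car1": ["car2"], "car2": ["car3"], "car3": ["car4"], "car4": ["car1"]}

print(find_cycle(two_cars))   # no loop, so no deadlock
print(find_cycle(four_cars))  # the four cars wait on each other in a loop
```

A database engine that finds such a cycle then has to pick one node in it as the victim and roll it back, which is what the log at the start of this article shows.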

InnoDB treats each transaction as a node, and the resources are the locks held by each transaction. When transaction 1 has to wait for a lock held by transaction 2, a directed edge from 1 to 2 is created, and the edges together form a directed graph.

1.2 InnoDB isolation level, index and lock

Deadlock detection is what InnoDB does for us once a deadlock has occurred. We need it, but what we need even more is the ability to avoid deadlocks in the first place. This requires understanding how locks work in InnoDB.

1.2.1 Relationship between lock and index

Suppose we have a message table (msg) with three fields: id is the primary key, token has a non-unique index, and message is not indexed.

id: bigint

token: varchar(30)

message: varchar(4096)

InnoDB uses a clustered index for the primary key. This is a storage layout in which the table data is stored together with the primary key, so the leaf nodes of the primary key index hold the row data. For an ordinary (secondary) index, the leaf nodes hold the primary key value.

Figure 4 Clustered index and secondary index
The following analyzes the relationship between indexes and locks.
1) delete from msg where id=2;

Because id is the primary key, the whole row can be locked directly.
Figure 5
2) delete from msg where token='CVS';

Because token has a secondary index, InnoDB first locks the matching secondary index entries (two of them), then locks the records on the corresponding primary keys.
Figure 6
3) delete from msg where message='order number';

message has no index, so the statement does a full table scan with filtering, and every record in the table gets an X lock.
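
The three cases can be summarized with a toy model. It is only a sketch of the rules stated above, not of InnoDB internals, and the rows in ROWS are hypothetical sample data, since the figures with the real rows are not reproduced here.

```python
from typing import Dict, List

# Hypothetical rows of the msg table: (id, token, message).
ROWS = [
    (1, "ASD", "hello"),
    (2, "CVS", "order number"),
    (3, "CVS", "world"),
    (4, "QWE", "order number"),
]

COLUMN_POSITION = {"id": 0, "token": 1, "message": 2}


def locks_for_delete(column: str, value) -> Dict[str, List]:
    """Model which index entries 'delete from msg where column = value' locks."""
    position = COLUMN_POSITION[column]
    matching = [row for row in ROWS if row[position] == value]
    if column == "id":
        # Primary key: lock the matching clustered index record directly.
        return {"secondary": [], "clustered": [row[0] for row in matching]}
    if column == "token":
        # Secondary index: lock the index entries first, then the primary keys.
        return {
            "secondary": [(row[1], row[0]) for row in matching],
            "clustered": [row[0] for row in matching],
        }
    # No index: full table scan, so every record gets an X lock.
    return {"secondary": [], "clustered": [row[0] for row in ROWS]}


print(locks_for_delete("id", 2))
print(locks_for_delete("token", "CVS"))
print(locks_for_delete("message", "order number"))
```

In this model the unindexed delete locks all four rows even though only two of them match, which is why missing indexes make lock conflicts much more likely.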
Figure 7

1.2.2 Relationship between lock and isolation level

As taught in university database courses, databases use transaction isolation levels to keep data correct under concurrent operations: 1) read uncommitted; 2) read committed (RC); 3) repeatable read (RR); 4) serializable. RC and RR are the ones we use most often.

Read committed (RC): only data that has been committed can be read.

Repeatable read (RR): all queries within the same transaction see the data as it was at the start of the transaction. This is InnoDB's default level.

What we discussed in 1.2.1 is actually locking under the RC isolation level. It prevents conflicts when different transactions modify and commit the same data, but problems can still occur when other transactions insert data.

As the figure below shows, transaction A gets one record on its first query and two records when it runs the same query a second time. From transaction A's point of view, it has seen a ghost. This is a phantom read: at the RC level row locks are taken, but phantom reads still cannot be avoided.

Figure 8

InnoDB's RR isolation level avoids phantom reads. How is this implemented? With locks, of course.

To solve the phantom read problem, InnoDB introduced the gap lock.

Transaction A executes: update msg set message='order' where token='ASD';

InnoDB first adds X locks to the matching index records, just as at the RC level. In addition, it locks the gaps between the non-unique index value 'ASD' and its two adjacent index values.

This way, when transaction B executes insert into msg values (null, 'ASD', 'hello'); commit;, InnoDB first checks whether the target gap is locked. If it is, the insert cannot proceed immediately and has to wait for the gap lock to be released. This prevents the phantom read.
Figure 9
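
The gap check can be illustrated with a simplified model. It assumes the token index is just a sorted list of keys and treats the locked range as the open interval between the two neighbours of the locked value. The sample tokens are hypothetical, and real InnoDB also uses the primary key to order equal secondary index values, which this sketch ignores.

```python
import bisect
from typing import List, Optional, Tuple


def locked_range(index_keys: List[str], value: str) -> Tuple[Optional[str], Optional[str]]:
    """Return the neighbours (lower, upper) of value in the index; None means unbounded."""
    keys = sorted(set(index_keys))
    first = bisect.bisect_left(keys, value)
    last = bisect.bisect_right(keys, value)
    lower = keys[first - 1] if first > 0 else None
    upper = keys[last] if last < len(keys) else None
    return lower, upper


def insert_must_wait(index_keys: List[str], locked_value: str, new_key: str) -> bool:
    """True if inserting new_key falls inside the range locked around locked_value."""
    lower, upper = locked_range(index_keys, locked_value)
    above_lower = lower is None or new_key > lower
    below_upper = upper is None or new_key < upper
    return above_lower and below_upper


tokens = ["AAA", "ASD", "CVS", "QWE"]  # hypothetical token values in the index

print(locked_range(tokens, "ASD"))             # range locked by transaction A
print(insert_must_wait(tokens, "ASD", "ASD"))  # transaction B's insert
print(insert_must_wait(tokens, "ASD", "ZZZ"))  # an insert outside the range
```

In this model transaction B's insert of 'ASD' lands inside the locked range and has to wait, while an insert far away from it proceeds at once.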

For a deeper understanding of how the locks work, this article is recommended: http://hedengcheng.com/?p=771#_Toc374698322

3 Deadlock causes

With the basic principles of InnoDB locking covered, we can now analyze the causes of deadlocks. As mentioned above, a deadlock generally arises when transactions wait for each other's resources and the waits eventually form a cycle. The simple examples below show how such a cycle forms.

3.1 Row lock conflict across different tables

This one is easy to understand: transaction A and transaction B operate on two tables, but they wait for each other's locks in a cycle.

Figure

3.2 Row lock conflict within the same table

This situation is common. We ran into it before with two jobs doing bulk updates: job A processes the ID list [1,2,3,4] while job B processes the ID list [8,9,10,4,2], which causes a deadlock.
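
Why these two lists can deadlock becomes visible once we look for IDs that the jobs lock in opposite orders. The helper below is a sketch written for this article, and the name conflicting_pairs is our own. It assumes each job locks its rows one by one in list order.

```python
from itertools import combinations
from typing import List, Tuple


def conflicting_pairs(order_a: List[int], order_b: List[int]) -> List[Tuple[int, int]]:
    """Return pairs of shared IDs that the two jobs lock in opposite orders."""
    position_b = {row_id: i for i, row_id in enumerate(order_b)}
    shared = [row_id for row_id in order_a if row_id in position_b]
    # (x, y) means job A locks x before y, while job B locks y before x.
    return [(x, y) for x, y in combinations(shared, 2) if position_b[x] > position_b[y]]


job_a = [1, 2, 3, 4]
job_b = [8, 9, 10, 4, 2]

print(conflicting_pairs(job_a, job_b))                  # IDs 2 and 4 are locked in opposite orders
print(conflicting_pairs(sorted(job_a), sorted(job_b)))  # no inverted pair once both lists are sorted
```

If job A holds the lock on row 2 and waits for row 4 while job B holds row 4 and waits for row 2, neither can continue. Sorting both lists removes the inverted pair, so the jobs can still block each other briefly but can no longer wait in a cycle.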

Figure

3.3 Lock conflict between different indexes

This case is more subtle. When transaction A executes, it locks the secondary index and also the clustered index, and the order in which it locks the clustered index is [1,4,2,3,5]. Transaction B locks only the clustered index, in the order [1,2,3,4,5]. This makes a deadlock possible.

Figure

3.4 Gap lock conflict

At the RR level, InnoDB can deadlock as in the case below, which is also subtle. Readers who find it unclear can analyze it using the gap lock principle from the previous section.
Figure

4 How to avoid deadlocks as much as possible

1) Access tables and rows in a fixed order. For example, for the two bulk-update jobs in section 3.2, a simple fix is to sort the ID list before executing, which avoids the cross-wait on locks. Likewise, in the case of section 3.1, making the SQL order of the two transactions consistent also avoids the deadlock.

2) Break large transactions into small ones. Large transactions are more prone to deadlock. If the business permits, split them into smaller ones.

3) Within a transaction, lock all the resources it needs in one go where possible, to reduce the probability of deadlock.

4) Lower the isolation level. If the business permits, lowering the isolation level is also a good choice. For example, changing it from RR to RC avoids many deadlocks caused by gap locks.

5) Add appropriate indexes to the table. As shown earlier, a statement that does not use an index locks every row in the table, which greatly increases the probability of deadlock.

5 How to locate the cause of deadlock

The following uses the deadlock case from the beginning of this article to show how to troubleshoot the cause of a deadlock.

1) Use the application's business log to locate the problem code and find the SQL of the corresponding transaction.

Because the deadlock was detected and the transaction rolled back, this shows up as an exception in the application's business log. From that exception we can locate the corresponding code and work out the transaction's SQL.

start tran
1 deleteheartcheckdobytoken
2 updatesessionuser
...
commit

In addition, the rollback information in the log shows that this transaction was the one rolled back when the deadlock was detected.

2) Determine the database isolation level.

Executing select @@global.tx_isolation shows the isolation level of the database (from MySQL 8.0 on, the variable is named transaction_isolation). Our database runs at RC, which with high probability rules out gap locks as the cause of the deadlock.

3) Ask a DBA to run show engine innodb status (show innodb status on old MySQL versions) and look at the most recent deadlock log.

This step is critical. With the DBA's help we get more detailed deadlock information. The detailed log shows that the transaction that conflicted with the one above has the following structure:

start tran
1 updatesessionuser
2 deleteheartcheckdobytoken
...
commit

The two transactions run the same two statements in opposite order, which is exactly the deadlock described in Figure 10.

When reprinting, please credit the source: http://www.cnblogs.com/LBSer


The above was summarized by someone else. For more detail, see also this well-written article: http://hedengcheng.com/?p=771#_Toc374698322
