http://mysql.taobao.org/monthly/2016/03/10/
Objective
More locks have been dealt with in recent times, including lock waits that cause business connections to accumulate or time out, deadlocks cause business failures, and so on, which can have a serious impact on the business, and users who have no experience in dealing with them often fail to do so. The following will be from the entire database design, development, operation and maintenance phase describes how to avoid the occurrence of lock problems, provide some best practices for the RDS user reference.
Design phase
In the database design phase, improper engine selection and index design can lead to more serious lock or deadlock problems after the business goes live.
1. The table engine chooses to use MyISAM, which raises table level lock wait.
Starting with version 5.5, MySQL officially shifted the default engine from MyISAM to InnoDB, the main difference between the two engines:
Because the MyISAM engine only supports table lock, during the use of the MyISAM engine table, when a query with a long execution time in the database becomes clogged with the update action on the table, it is often encountered that the thread session is in table-level lock waiting (waiting for table Lock), in severe cases where the application is not properly connected due to the full number of instance connections
CREATETABLE' T_myisam ' (' ID 'int11)DEFAULTNULL)Engine=myisamDEFAULTCharset=utf8;Query |111 |UserSleep |Select ID,Sleep100) from t_myisam | Query | 108 | Waiting for table level lock | update t_myisam set id=2 where id=1| Query | 3 | Waiting for table level lock | update t_myisam set id=2 where id=1|
As can be seen from the above example, the T_myisam table is the MyISAM storage engine, when the table has a long execution time of the query statement at the time of execution, the table on the other updates are blocked, this time the application or database connection quickly exhausted, resulting in application request failure. The quick way to recover this time is to kill the thread id:111 (you can do show processlist to see all the connection states of the current database). In addition, the table index of the MyISAM storage engine causes the index to be corrupted when the instance is closed unexpectedly, and the table must be repair, which also blocks all requests on the table.
2. The table index was improperly designed, causing the database to deadlock.
Index design is a very important part of database design, not only related to the performance of subsequent business, if the design is improper can also lead to business deadlock. The following is a case of online system, the database in the concurrent update of the deadlock, the location of the update by locating using two indexes caused by the deadlock information as follows:
(1) transaction:transaction 29285454235, ACTIVE 0.001 sec fetching rowsmysql tables inUse3, Locked3LOCK WAIT6Lock struct (s), heapSize1184,4RowLock (s) MySQL thread ID6641616, OS thread handle0x2b165c4b1700,Query ID2819042793710.103.180.86 Test_ebs SearchingRowsForUpdateUPDATE TestSET is_deleted =1WHERE group_id =1332577and test_id =4580605*** (1) WaitingFor thisLOCKTo be Granted:record LOCKSSpace IDPageNo37122 N Bits65WIndex' PRIMARY 'OfTable' Testdriver '.' Test ' TRX ID29285454235 Lock_mode X Locks Rec butNot Gap WaitingrecordLock, HeapNo179 Physical Record:n_fields8; Compact format; Info bits 0*** (2) transaction:transaction 29285454237, ACTIVE 0.001 sec fetching rows, thread declared inside InnoDB 4980 MySQL tables inUse3, Locked35Lock struct (s), heapSize1184,3RowLock (s) MySQL thread ID6639213, OS thread handle0x2b1694cc2700,Query ID2819042793910.103.180.113 Test_ebs SearchingRowsForUpdateUPDATE TestSET is_deleted =1WHERE group_id =1332577and test_id = 4212859*** (2) holds the LOCK (S): RECORD LOCKS Space ID page no 37122 n b Its 352 index ' PRIMARY ' of table ' Testdriver '. ' Test ' Trx ID 29285454237 lock_mode X locks rec but not Gaprecord lock and Heap no 179 physical RECO Rd:n_fields 8; compact format;
Table structure:
CREATETABLE' Test ' (' ID 'bigint20)NotNULL auto_increment COMMENT ' primary key ',' test_id 'bigint20)DEFAULTNull' group_id 'bigint20)DEFAULTNULL COMMENT ' Id, corresponding to Test_group.id ',' Gmt_created ' datetimedefault null COMMENT ' creation time ',' gmt_modified ' datetime DEFAULT null COMMENT ' modified time ',' is_deleted ' tinyint (4) DEFAULT '0 ' COMMENT ' deleted. ',PRIMARY key (' id '),key ' Idx_testid ' (' test_id '),key ' idx_groupid ' (' group_id ')) engine=InnoDB auto_increment=7429111;
SQL Execution Plan:
test SET is_deleted = 1 WHERE group_id = 1332577 and test_id = 4212859| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra || 1 | SIMPLE | test | index_merge | idx_testid,idx_groupid | idx_testid,idx_groupid | 9,9 | | 1 | Using intersect(idx_testid,idx_groupid); Using where; Using temporary |
So the first transaction is based on the group_id index, the primary ID is locked, then the primary ID is locked according to the test_id index, the second transaction is based on the test_id index, the primary ID is locked, and then according to the group_id index, To lock the primary ID, so that a concurrent update could result in a dead index.
MySQL official has also confirmed this bug:77209, there are two ways to solve the problem:
- Plus test_id + group_id combination index, so you can avoid the index merge;
- Optimizes the index merge of the optimizer to close.
Development phase
Transaction processing time is too long, resulting in concurrent lock waits.
Concurrent transactions are often seen in the database of the application scenarios, in this scenario, the need to avoid large transactions, long transactions, complex transactions caused transactions in the database to run longer, transaction time is longer, resulting in transaction lock holding time is longer, affecting the overall database throughput. In the following example, a large number of update waits occur in the user's business database, resulting in a large number of business super-times errors:
Troubleshoot the problem:
By show processlist
determining the thread in which the lock waits and the SQL information:
|Query|37|updating|UPDATE test_warning SET ... WHERE test_id = ‘46a9b‘
Lock waiting and transaction attempts through InnoDB's INFORMATION_SCHEMA database to isolate the relevant lock information:
Select r.trx_mysql_thread_id waiting_thread, r.trx_id waiting_trx_id,r.trx_query waiting_query, b.trx_id Blocking_trx _id, B.trx_query blocking_query,b.trx_mysql_thread_id blocking_thread,b.trx_ started,b.trx_wait_startedFrom Information_schema.innodb_lock_waits WInnerJoin Information_schema.innodb_trx bon b.trx_id =w.blocking_trx_id inner joininformation_schema.innodb_trx R on r.trx_id= w.requesting_trx_id \gwaiting_thread: 318984063waiting_trx_id: 26432631waiting_query: update test_warning set ... where test_id = ' 46a9b ' blocking_trx_id: 26432630blocking_query: nullblocking_thread: 235202017trx_started: 2016-03- 01 13: 54: 39
From the lock waiting information described, transaction 26432631 is blocked by 26432630, then we can from the general log to find out what the transaction 26432630 did.
Finding the location of the above BLCOK UPDATE statement in the log from the general log of the pre-opened database and discovering the root cause of the update being BLCOK: There are two different SQL in the log that update the same record concurrently. So after the updated SQL will wait for the previous update of SQL, if SQL1 is located in the transaction 1 is not committed, then transaction 2 will wait, so that the status of the above updating
UPDATE test_warning ..... WHERE test_id = ‘46a9b‘318984063 Query UPDATE test_warning ..... task_order_id = ‘‘ WHERE test_id = ‘46a9b‘
So we can look at the context of this transaction:
The SQL context for thread id=235202017:
SET autocommit=0235202017 (13:54:39) Query UPDATE test_warning SET .... WHERE test_id = ‘46a9b‘235202017 Query commit
The SQL context for thread id=318984063:
318984063 QuerySET autocommit=1318984063 (13:54:39)QuerySELECT ....From Test_waybillWHERE (test_idIn (' 46a9b '))318984063QuerySELECT ...from test_waybill where test_id = 46a9b ' 318984063 query UPDATE test _warning set ..... where test_id = ' 46a9b ' 318984063 ( Span class= "Hljs-number" >13:55:31) query update test_waybill_current t ... where t.test_id in ( ' 46a9b ') 318984063 query SET autocommit=< Span class= "Hljs-number" >0
You can see that transaction 1 starts from 13:54:39 until 13:55:30, and transaction 2 has the same record in update Transaction 1, so until transaction 1 commits, transaction 2 is executed, and with such a log, this information is sent to the user to find the problem quickly. In transaction 1, due to the existence of other business logic, the submission of transaction 1 was not completed, which led to the other business locks.
Maintenance phase
DDL operations are queried for block.
When the application goes online into the maintenance phase, there will be more database change operations, such as adding fields, adding indexes and other operations, this kind of operation caused the lock failure is very frequent, the following will introduce a case, a DDL operation is queried block, resulting in database connection accumulation:
Query |6 | User Sleep |Select ID,Sleep50)From TQuery |4 | WaitingForTable metadataLock |AlterTable TAddColumn Gmt_create datetimeQuery |2 | WaitingForTable metadatalock | select * from t where Id=1query | 1 | Waiting for table metadata lock | Span class= "Hljs-keyword" >select * from t where Id=2query | 1 | Waiting for table metadata lock | Span class= "Hljs-keyword" >update t set ID =2 where id=1
Metadata lock wait means: To maintain data consistency for table metadata in a concurrency environment, you cannot write to metadata when there are active transactions (explicit or implicit) on the table. So MySQL introduced metadata lock to protect the table's metadata information. Therefore, if there is an active transaction on the table (uncommitted or rollback) on the table, the requested write session waits at Metadata lock wait.
Common factors that cause Metadata lock Wait Wait include: active transaction, there is currently a long query on the table, show or implicitly open the transaction is not committed or rollback, such as after the completion of the query is not committed or rollback, the table has failed query transactions.
In the above case, the thread state of the query, update, and DDL operations is waiting for table metadata lock, the operation of table T is all blocked, the front-end flow of requests to reach the database, this time the database connection is easily filled, So let's analyze why these locks are waiting:
- Alter-action lock wait: Due to an action to add a field on table T, the operation will get a metadata lock on the table before the end, but there is already a query on the table that has not been completed, causing metadata lock to never get it, So the Add Field operation can only wait for the query to end, which explains why the alter operation has a status of waiting for table metadata lock.
- Lock waits for queries and updates: Because the alter operation previously attempted to acquire metadata lock on the T table, subsequent query and update operations on table T are blocked by the alter operation when acquiring metadata lock. This causes these thread states to be waiting for table metadata lock.
The workaround is to kill thread 6.
Summarize
Lock problem is a very common problem, we need to be in the database development, design, management of all stages need to pay attention to prevent the situation, to be aware of.
Design and development phase:
- Table design to avoid using the MyISAM storage engine, switch to InnoDB engine;
- Create an appropriate index for SQL to avoid multiple single-column index execution errors;
- Avoid large transactions, long transactions, and complex transactions that cause transactions to run longer in the database.
Manage Operations Phase:
- Perform the above operations during low peak periods, such as creating a delete index;
- Before the structure changes, observe whether there are long-running SQL, uncommitted transactions in the database;
- During a fabric change, there is a lock wait for the thread state of the monitoring database.
MySQL FAQ · MySQL Lock problem Best practices