Bug #19528825 "unable to purge a record ",
Overview:
In the production environment, when insert buffer is enabled (innodb_change_buffering = all), some instances occasionally encounter the "unable to purge a record" error. This is actually an old bug. From the official bug system, other users have encountered similar problems, and the official version has fixed the bug in May February this year.
Insert buffer:
Before describing the cause and solution of the problem, let's briefly talk about insert buffer. In short, insert buffer is a cache for secondary index operations. During the secondary index operation, if the requested page is not in the buffer pool, the secondary index operation is cached. When the page is read next time, the operations in the page and buffer are merged, in this way, you can reduce random reads on the disk during UPDATE (INSERT, DELETE, UPDATE, and so on) operations, and increase the UPDATE response time.
The earliest insert buffer only supports the buffer for insert operations. Therefore, this buffer is called insert buffer. By mysql5.6, it not only supports the buffer operation for insert operations, but also supports update, delete, the buffer function of purge and other operations, and the insert buffer is also called the changing buffer, which is later referred to as ibuf. By setting the innodb_change_buffering parameter, You can flexibly control whether the buffer for the secondary index operation is enabled or not. Ibuf only supports common secondary indexes and does not work for unique and primary key indexes. We know that the innodb table is an index organization table. Each table consists of a clustered index and several secondary indexes. If ibuf is used, an ibuf operator is generated for each update operation on the secondary index. The following table lists the correspondence between the ibuf operator and the update statement:
Operation |
Insert buffer Operator |
Description |
Insert |
IBUF_OP_INSERT |
|
Delete |
IBUF_OP_DELETE_MARK |
The Delete statement marks the deletion record. |
Purge |
IBUF_OP_DELETE |
Purge actually cleans and deletes the mark records physically |
Update |
IBUF_OP_DELETE_MARK IBUF_OP_INSERT |
Secondary Index update consists of Delete + Insert. |
Purge:
The purge mentioned above is not an update statement initiated by the user, but a background cleanup operation required when the innodb Storage engine implements MVCC (Multi-version concurrency control. The multi-version implementation mechanism of Innodb (using rollback segments to save historical version information and associating ROLLPTR in records with rollback segment records) When deleting or updating records, it is not a physical deletion, but a deletion mark on the record, which is cleared by the background thread. This process is the purge process. Of course, purge also has another role to clean up rollback segments and reclaim space so that the space can be reused. For the implementation of purge and mvcc, you have the opportunity to organize an article separately.
Cause:
Back to the problem itself, we can reproduce the problem through a simple scenario. Assume that the table t (id int, c1 varchar (100), primary key (id), key (c1) contains a record (1, ); assume that the page of table t is not in the buffer pool, so that ibuf can be used. The operation sequence in the following table can be used to reproduce the problem.
Operation Sequence |
Update statement |
Ibuf operation |
1 [DELETE] |
Delete from t where id = 1; |
IBUF_OP_DELETE_MARK |
2 [INSERT] |
Insert into t values (1, 'A '); |
IBUF_OP_INSERT |
3 [PURGE] |
|
IBUF_OP_DELETE |
The merge operation is performed during page reading. The merge operation is performed in the order of the corresponding page operations in ibuf. When you perform Step 1: IBUF_OP_DELETE, the records to be deleted are not marked as deleted (the records with the same primary key and secondary index are inserted in step 1, and the deletion mark is cleared). An exception is thrown. Since purge operations are executed by separate threads in the background, the operations for executing update statements and purge operations are not strictly sequential, if the order of the preceding operations is 1-> 3-> 2, the problem does not recur.
Important Process:
From the above analysis, the main cause of the problem is that the purge operation and the update operation are not strictly synchronized, so that the purge may clean the records that are not marked with the deletion.
Purge process:
Function call relationship:
Srv_do_purge-> trx_purge-> que_thr_step-> row_purge_step
-> Row_purge-> row_purge_record_func-> row_purge_del_mark
-> Row_purge_remove_sec_if_poss-> row_purge_remove_sec_if_poss_leaf
Purge secondary index process:
Determine whether to use the ibuf call relationship:
Row_purge_remove_sec_if_poss_leaf-> row_search_index_entry-> btr_pcur_open_func-> btr_cur_search_to_nth_level-> ibuf_should_try
Collaboration between update operations and purge operations in ibuf:
When the purge thread obtains the page, if the page is not in the buffer, it sets the watch Mark for the page, and then executes ibuf_insert to cache the purge operation. The update operation (insert, delete, update, and so on) also calls the ibuf_insert operation for buffer. First, it will determine whether the page has a watch Mark. If yes, it is considered to be in conflict with the purge action, ibuf cannot be used. In this case, the page will be read from the disk, and the purge operation will be merged during the page reading process. Subsequent update operations will not cause problems. You can use the watch flag to perform update operations and purge operations in collaboration with ibuf to avoid the problems mentioned above.
Here, you may have a question. If you do not use ibuf, normal update and purge operations are in different threads, and there may also be a (DELETE, INSERT, PURGE) sequence, why is there no problem? Because the row_purge_poss_sec function is also called during the purge secondary index to check whether the record can be purged (when the clustered index corresponding to the secondary index record does not have the delete mark or the trx_id is earlier than the purge view, cannot purge) to avoid the above problems.
Solution:
Similarly, during the purge secondary index process, btr_cur_search_to_nth_level first calls the buf_page_get_gen function to set the watch and then calls row_purge_poss_sec to determine whether the record can be purged. if the user has re-insert, the pur; otherwise, it indicates that no insert record exists. Continue to call the ibuf_insert interface for caching. In the update operation, when using ibuf, insert checks whether the watch tag exists. However, when the program logic returns, the tag is lost, leading to problems. Therefore, as long as you ensure that the update operation is actually using ibuf, check that there is no purge to use ibuf at the same time, the problem can be avoided.
For detailed solutions, refer:
Https://github.com/mysql/mysql-server/commit/ec369cb4f363161dfbbbd662b20763b54808b7d1
Reference:
Http://mysqllover.com /? P = 1264
Https://bugs.mysql.com/bug.php? Id = 73767
Http://hedengcheng.com /? P = 94
Http://mysql.taobao.org/monthly/2015/04/01/