Skip or not? This is a problem, jump or problem

Source: Internet
Author: User

Skip or not? This is a problem, jump or problem
On Monday (), a project was changed, which was a major change. The database test operation continued from morning till seventeen o'clock P.M., and QA colleagues found that, the data on the slave is inconsistent with that on the master.
I forgot to introduce it. This module of this project has read requests and delete logic on it. Are you scared?

This problem should be reasonable but unexpected. In fact, DBA encountered slave problems during DB change in the afternoon:

Could not execute Delete_rows event on table codcpc_1004pc_0x3EC.mail; Can't find record in 'mail', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log binlog.010464, end_log_pos 82113, Error_code: 1032141117  7:54:31 [Warning] Slave: Can't find record in 'mail' Error_code: 1032

I skipped a non-existing table directly and wrote the following script:

mysqlconn="mysql -uADMIN -p4word" #while [ "1" = "1" ]for i in `seq 1 30`do   status=`$mysqlconn -e "show slave status\G;" | grep -i "Last_SQL_Error" | grep -i "mail"`    if [ -n "$status" ];then   $mysqlconn -vvve "stop slave;set global sql_slave_skip_counter=1;start slave;"    sleep 1   else     $mysqlconn -vvve "show slave status\G"     echo "error is OK!"     exit 0   fidone

It was strange that I thought I had to write an endless loop and ran it. I didn't expect it to pass the debugging for less than three times. At that time, I didn't go into details, and I didn't have time to study it, which led to the late-night troubleshooting.


The initiator is an SQL statement: delete from mail where recipientEntityID in (select userid from test. deltb );


The error occurred today, mainly because Delete_rows found that the key does not exist. If it is changed to statement, this problem will not occur. The SQL statement that causes the culprit can be executed on slave.
The key is that our binlog_format is in the MIXED mode. So the question is, under what circumstances will the generated binlog be converted to the row format when binlog_format = mixed? I checked the official documentation for this and did not quite understand it. I will keep it updated later.

When binlog_format is in row mode, the skip address is at the transaction level.
In the afternoon delete, the delete on the master will actually convert the binlog into many binlog events, which are surrounded by only a few begin commit events.
This explains my problem above.


So skip or not? SQL _slave_skip_counter = N online jumps are indeed risky.


(1) In InnoDB, N = 1 skips the entire transaction, not just an SQL statement.
(2) In RBR mode, both InnoDB and MyISAM binlog events are organized by begin, row group, and commit. When N = 1, the entire transaction content is skipped.


The safer method is SQL _exec_mode = IDEMPOTENT. Only the problematic SQL or row is skipped. Compared with SQL _slave_skip_counter, slave_exec_mode has a smaller and safer processing range, but it is only applicable to the RBR mode :(

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.