MySQL master-slave delay replication practice and production failure case recovery practice

Source: Internet
Author: User
Tags iptables egrep

1.1

MySQL master-Slave delay Replication Introduction

From MySQL5.6 to support the master-slave delay replication, the main problem is that when the main library has a logical data deletion or error updates, all the library will be the wrong update, resulting in all database data anomalies, even if there is a scheduled backup data can be used for data recovery, especially when the database data is large, the recovery time will be very long , the database data is deleted or the error data is affected by the normal access experience during recovery.

Delay replication can be a good solution to this problem. For example, you can set an update from the library and the main library to delay 1 hours, so that the main library data out of the problem, within 1 hours of discovery, can be harmless recovery from the library processing, so that it is still the correct complete data, eliminating the time of data recovery, the user experience has increased.

1.2

MySQL master-slave delay replication Configuration practice

The MySQL5.6 version of the deferred replication configuration is implemented by executing the following command on slave:

Change MASTER to Master_delay = N;
The Master_delay option is added directly #读者可在配置延迟从库Change master.

This statement sets the slave database to delay n seconds and then replicates the data with the primary database by logging on to the Slave database server (this is 52) and then executing the following command.

mysql> stop Slave;

Query OK, 0 rows affected (0.45 sec)

mysql> change MASTER to Master_delay = 20;  

#这是延迟的核心命令.

Query OK, 0 rows affected (0.22 sec)

mysql> start slave;

Query OK, 0 rows affected (0.15 sec)

Mysql> Show Slave Status\g

1. Row ***************************

... Omit a number of ...

Slave_io_running:yes

Slave_sql_running:yes

... Omit a number of ...

Sql_delay:20

#这里的数字就是设置的延迟20秒后进行复制.

Sql_remaining_delay:null

#还剩多少秒执行复制.

Slave_sql_running_state:slavexx to update it

#SQL线程的状态.

... Omit a number of ...

1 row in Set (0.09 sec)

The three state parameters commonly used in the replication state are Sql_delay, Sql_remaining_delay, slave_sql_running_state, which indicate that they have been commented separately.

Main Library Insert data:

mysql> CREATE DATABASE Lanlan;

Query OK, 1 row Affected (0.00 sec)

After the main library has inserted data for 1 seconds, execute show databases from the library, and see if the data is synchronized in time, with the following results:

mysql> show databases;

+--------------------+

| Database |

+--------------------+

| Information_schema |

| Alex_python |

| MySQL |

| Performance_schema |

+--------------------+

When you do not see the database Lanlan created on the main library from the library, perform the intermittent execution show slave status\g view the parameter status of the delay as follows output.

Mysql> Show Slave Status\g

... Omit a number of ...

Sql_delay:20

Sql_remaining_delay:13

#剩于13秒执行复制.

Slave_sql_running_state:waiting until master_delay seconds after MASTER executed event

... Omit a number of ...

1 row in Set (0.00 sec)

Mysql> Show Slave Status\g

... Omit a number of ...

Sql_delay:20

Sql_remaining_delay:9

#剩于9秒执行复制.

Slave_sql_running_state:waiting until master_delay seconds after MASTER executed event

... Omit a number of ...

1 row in Set (0.00 sec)

Mysql> Show Slave Status\g

... Omit a number of ...

Sql_delay:20

Sql_remaining_delay:null

#复制完成后, there is no status for new data updates.

Slave_sql_running_state:slave have read all relay log; Waiting for the slave I/O thread to update it

... Omit a number of ...

1 row in Set (0.00 sec)

View the trunk log from the library when no update data from the library is in the deferred replication time period.

[Email protected] data]# pwd

/application/mysql/data

[Email protected] data]# Mysqlbinlog db02-relay-bin.000002

SET @ @session. lc_time_names=0/*!*/;

SET @ @session. collation_database=default/*!*/;

Create DATABASE Lanlan

#中继日志确已经有了创建的语句, indicates that the IO thread is still working in real time.

1.3

MySQL Delay Replication principle analysis

MySQL's deferred replication actually affects only the SQL thread that applies the data to the slave database, and the I/O thread has already written the primary library's updated data to the trunk log from the library, so that during deferred replication even if the main library goes down, from the library to the time of the deferred replication, The data is still updated to coincide with the main library outage.

Special note: In fact, MySQL delayed replication function as early as a few years ago, the old boy teacher has already realized this function with the thought, and applies in the enterprise production backup and restores, the method is as follows:

1) Section 15.2 has been introduced, the execution of mysql> stop slave sql_thread; The SQL thread is stopped, then backed up, the main library goes down during the backup, but the binlog of the main library is still sent to the slave library in time. Eventually the library can still be restored to and from the state before the main library goes down.

2) Write a script, the use of timed tasks to control the Sql_thread stop and run, and then the library can control the implementation of a simple library delay replication function, which is the importance of thought. Of course, the 5.6 version of the software provided by the feature bar, 5.6 of the previous database to achieve delayed replication, you can think of the old boy used to delay the backup and the idea of delayed replication.

1.4 MySQL Deferred replication recovery case Practice

In the enterprise, we want to according to the business requirements of delay replication to specify a time period, such as 1 hours after the copy from the library, then in this one hours, if the main library mistakenly update the data, then the other from the library also innocently update the data, how to restore the delay from the library to normal without mistakenly update the full state before the data? and see the practice below.

1.4.1 Adjust the delay from library to 1 hours

Simulate the environment and adjust the delay from library to 3,600 seconds;

mysql> stop Slave;

Query OK, 0 rows Affected (0.00 sec)

mysql> change MASTER to Master_delay = 3600;

Query OK, 0 rows affected (0.03 sec)

mysql> start slave;

Query OK, 0 rows affected (0.08 sec)

Mysql> Show Slave Status\g

... Omit a number of ...

sql_delay:3600

sql_remaining_delay:2414

Slave_sql_running_state:waiting until master_delay seconds after MASTER executed event

... Omit a number of ...

1 row in Set (0.00 sec)

1.4.2 simulating writing data in the main library

Writes 1 libraries every 5 seconds, when the impersonated user writes the data.

[[email protected] ~]# for n in {1..5}

> Do

> mysql-e "CREATE Database Oldboy$n"

> Sleep 5

> Done

Tip: Shell scripting knowledge can refer to the book "Learning Linux Operations with older boys: Shell programming Combat".

1.4.3 Simulation of human destruction data

The simulated vandalism data can also be an UPDATE statement without a where.

mysql> drop Database oldboy5;

#删除oldboy5数据库, the back of this data back, and other data to have.

Query OK, 0 rows Affected (0.00 sec)

Mysql> show databases like ' oldboy% ';

+--------------------+

| Database (oldboy%) |

+--------------------+

| oldboy1 |

| Oldboy2 |

| Oldboy3 |

| Oldboy4 |

+--------------------+

4 rows in Set (0.00 sec)

#此时, all from the library is already bad data, only the delay from the library is good, but an hour ago data.

1.4.4 Stop Write Library recovery data

When the database is mistakenly deleted data, in order to complete the recovery of data (especially update without conditional destruction of data), it is best to choose to stop the external access measures, the need to sacrifice the user experience, unless the business can tolerate data inconsistency, and not be destroyed two times. From the library can continue to open to the user read access, but also may cause the user to read the data is bad data, need readers to measure data consistency and user experience problems. This example uses Iptables to block all user access to the main library.

[[email protected] ~]# iptables-i input-p TCP--dport 3306! -S 172.16.1.51-j DROP

#非172.16.1.51 prohibits access to database 3306 ports, and 51 is the main library IP.

1.4.5 Check Binlog send and receive is complete

Log in to the main library to perform show processlist, confirm that the Binlog is all sent to the delay from the library, and, of course, can log on delay to execute show processlist from the library, or whether to receive full part Binlog from the library IO thread for status query acknowledgement.

Mysql> show Processlist;

+----+---------------------------------------+---------------+

| 12 | Rep | 172.16.1.52:39043 | NULL |  Binlog Dump | 709 | Master has sent all binlog to slave; Waiting for Binlog to be updated | NULL |

+----+------------------------------------+------------------+

2 rows in Set (0.00 sec)

#上述提示表示主库已经发送完所有Binlog日志到从库了.

1.4.6 pauses master-slave replication from the library and checks the data

Execute stop slave from the library, pause master-slave replication, and see if the database is synchronized.

mysql> stop Slave;

mysql> show databases;

+--------------------+

| Database |

+--------------------+

| Information_schema |

| Alex_python |

| MySQL |

| Performance_schema |

+--------------------+

4 rows in Set (0.00 sec)

#提示: The data is not synchronized to the delay from the library because the delay time is not yet reached.

1.4.7

locating recovery data corresponding to the relay log

Based on the SQL thread that Relay-log.info records, the location resolution of the relay-log is not applied to the Relay-bin log from the library.

[email protected] data]# pwd  

#进入到中继日志所在的目录.

/application/mysql/data

[Email protected] data]# ls-l *relay*

#查看中继日志相关信息.

-RW-RW----. 1 mysql mysql 172 September 17:32 db02-relay-bin.000001

-RW-RW----. 1 MySQL mysql 993 September 17:37 db02-relay-bin.000002

-RW-RW----. 1 MySQL mysql 48 September 17:32 Db02-relay-bin.index

-RW-RW----. 1 MySQL MySQL 61 September 17:32 Relay-log.info

#SQL线程读取中继日志位置信息.

[email protected] data]# cat Relay-log.info

#查看中继日志应用的位置信息.

7

./db02-relay-bin.000002 #SQL线程读取中继日志的文件名信息.

284 #SQL线程读取中继日志位置点信息.

oldboy-bin.000024

309

3600

0

1

1.4.8

Parse the required trunk log

Resolves all remaining relay-bin trunk log data unresolved by the SQL thread, because the amount of analog data is not large enough, Therefore, in this case only db02-relay-bin.000002 a relay log, the actual work may have more than one, together with the resolution of a specified file or different files can be stored.

[Email protected] data]# mysqlbinlog--start-position=284 db02-relay-bin.000002 >relay.sql

#根据上述的relay-log.info Intermediate log files and location information to parse the relay log, the use of this command has been explained in the previous section, this is not exhausted.

1.4.9

Remove the problem SQL from the parse file

The SQL statement that destroys the database data is found and removed from the parsed SQL statement, which is "drop database Oldboy5".

[email protected] data]# egrep "drop database Oldboy5" Relay.sql

#检查是否存在误删的SQL语句.

Drop Database Oldboy5

[email protected] data]# sed-i '/drop database oldboy5/d ' relay.sql

#删除, take care not to delete too much.

[email protected] data]# egrep "^drop database Oldboy5" Relay.sql   

#检查删除结果.

1.4.10

Restore the processed SQL file to the database

Restores the parsed and processed relay.sql data files back to the delay from the library.

[Email protected] data]# Mysql<relay.sql

#这步就是从停止slave复制开始, by manually restoring all remaining log data to the database according to the Relay-log.info location, you need to be aware that it is time to clean up the statement that destroyed the database before recovering.

[Email protected] data]# mysql-e "show databases like ' oldboy% ';"

+--------------------+

| Database (oldboy%) |

+--------------------+

| oldboy1 |

| Oldboy2 |

| Oldboy3 |

| Oldboy4 |

|  Oldboy5 | #被删除的oldboy5数据库已经找回.

+--------------------+

To this end, using the deferred database to recover data, the library is promoted to the main library (see manual master-slave role Switch chapter content), the VIP point to the "delay from the library", that is, the new main library provides user access, and then in the other damaged master and slave database repair.

MySQL master-slave delay replication practice and production failure case recovery practice

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.