Impact of master-slave database replication delay on project quality

Last Update:2017-01-01 Source: Internet

Author: User


I was recently testing a new project. The original project did not use master-slave databases or a server cluster.

The new project, however, has been upgraded, and as the business grows, basic architecture such as a server cluster and read-write separation has to be used.

Problems that arise:

When a product is purchased in system A, A subtracts the purchased quantity from the product's remaining quantity, sends an MQ message, and commits these steps as one transaction.

After system B receives the MQ message, it deducts the user's balance. Once the deduction succeeds, it sends its own MQ message. All of this is one transaction.

A listener in system A receives B's MQ message, queries the record of the purchased product, and performs the actual quantity deduction.

The problem is that system A queries the product record from the slave database. B's transaction has already completed, but the record that A wrote to the master database still does not exist on the slave at that moment, so the whole business flow breaks.

This is the result of MySQL's master-slave synchronization delay.
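The race described above can be reproduced with a toy model. This is only a sketch: the `LaggingReplica` class and the `order:1001` key are hypothetical names, and two in-memory dictionaries stand in for the master and slave databases.

```python
class LaggingReplica:
    """Toy master/slave pair where the slave applies writes only when told to."""

    def __init__(self):
        self.master = {}
        self.slave = {}
        self._pending = []  # committed on the master, not yet replayed on the slave

    def write(self, key, value):
        # Commit on the master and queue the change for replication.
        self.master[key] = value
        self._pending.append((key, value))

    def replicate(self):
        # Replay every queued change on the slave; the delay ends here.
        for key, value in self._pending:
            self.slave[key] = value
        self._pending.clear()

    def read_from_slave(self, key):
        return self.slave.get(key)


db = LaggingReplica()
db.write("order:1001", {"product": "sku-1", "quantity": 2})  # system A commits
stale = db.read_from_slave("order:1001")  # A's listener reads before replication
db.replicate()
fresh = db.read_from_slave("order:1001")  # same read after the slave catches up
print(stale, fresh)
```

The first read returns nothing even though the write has already been committed on the master, which is the situation system A's listener runs into.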

MySQL master-slave synchronization is a very mature architecture. Its advantages are: ① queries (what we usually call the read workload) can run on the slave server, reducing pressure on the master server; ② backups can be taken from the slave server, so the master server's service is not affected during backup; ③ when the master server has a problem, you can switch over to the slave server.

These benefits are well understood and widely used in project deployments. But MySQL master-slave synchronization has always had the problem of slave delay. Why does this problem occur, and how can it be solved?

1. The principle behind MySQL master-slave synchronization delay.

2. How MySQL master-slave synchronization delay arises.

3. Solutions to MySQL master-slave synchronization delay.

1. The principle behind MySQL master-slave synchronization delay.

A: To explain the principle behind the delay, we have to start from how MySQL master-slave replication works. Replication is a single-threaded operation. The master writes every DDL and DML statement to the binlog. The binlog is written sequentially, so this is very efficient. The slave's I/O thread (reported as Slave_IO_Running) fetches the log from the master, which is also very efficient. The problem comes next: the slave's SQL thread (reported as Slave_SQL_Running) replays the master's DDL and DML operations on the slave. The I/O for DML and DDL is random rather than sequential, so it costs much more, and it may also contend for locks with other queries running on the slave. Because the SQL thread is also single-threaded, if one DDL statement gets stuck and takes 10 minutes to execute, every statement after it has to wait for that DDL to finish before it can run, and this produces the delay. Someone may ask: "The same DDL also takes 10 minutes on the master, so why does the slave fall behind?" The answer is that the master can execute statements concurrently, while the slave's SQL thread cannot.
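A rough model of this effect, under simplifying assumptions: the master runs every statement concurrently, the slave replays them one at a time in commit order, and each statement costs the same time on both sides. The function name `slave_lag` and the sample timings are illustrative only.

```python
def slave_lag(events):
    """Return, for each statement in commit order, how many seconds after its
    commit on the master it finishes replaying on a single-threaded slave.

    events: list of (start_on_master, duration) pairs, in seconds.
    """
    # The master runs statements concurrently: each commits at start + duration.
    committed = sorted((start + duration, duration) for start, duration in events)

    lags = []
    slave_free_at = 0  # time at which the single SQL thread becomes idle
    for master_commit, duration in committed:
        # The slave cannot start before the event exists, nor before it is idle.
        slave_start = max(master_commit, slave_free_at)
        slave_free_at = slave_start + duration
        lags.append(slave_free_at - master_commit)
    return lags


# Two 10-minute statements run side by side on the master, then a 1-second one.
print(slave_lag([(0, 600), (0, 600), (600, 1)]))  # [600, 1200, 1200]
```

The two long statements overlap on the master but run back to back on the slave, so even the 1-second statement that follows them is applied 20 minutes late.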

2. How MySQL master-slave synchronization delay arises.

A: When the TPS on the master is high, the volume of statements it generates exceeds what the slave's single SQL thread can handle, and delay builds up. Lock waits caused by large queries running on the slave are another possible cause.

3. Solutions to MySQL master-slave synchronization delay.

A: The simplest way to reduce slave synchronization delay is to optimize the architecture so that DDL on the master executes as quickly as possible. In addition, the master handles writes and needs high data safety, with settings such as sync_binlog=1 and innodb_flush_log_at_trx_commit=1. The slave does not need that level of data safety: you can set sync_binlog to 0 or turn the binlog off, and innodb_flush_log_at_trx_commit can also be set to 0, to make SQL execution more efficient. Another option is to use better hardware for the slave than for the master.
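As a small illustration, the relaxed slave settings mentioned above can be turned into statements to run on the slave. This is a sketch under assumptions: the helper `to_set_statements` is hypothetical, and `SET GLOBAL` changes do not survive a restart, so the same values would normally also go into the slave's configuration file.

```python
# Durability settings relaxed on the slave only; the master keeps both at 1.
RELAXED_SLAVE_SETTINGS = {
    "sync_binlog": 0,
    "innodb_flush_log_at_trx_commit": 0,
}


def to_set_statements(settings):
    """Build one SET GLOBAL statement per variable, in insertion order."""
    return [f"SET GLOBAL {name} = {value};" for name, value in settings.items()]


for statement in to_set_statements(RELAXED_SLAVE_SETTINGS):
    print(statement)
```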

MySQL 5.6.3 already supports multi-threaded master-slave replication. The principle is similar to Ding Qi's approach: Ding Qi's version parallelizes by table, while Oracle's implementation parallelizes by database (schema), so different databases can use different replication threads.

A master/slave setup on a LAN is usually able to meet the requirements of 'real-time' backup. If the delay is large, check the following factors first:
1. Network latency
2. Master load
3. Slave load
A common practice is to spread read requests across multiple slaves, then set aside one dedicated server among them that is used only for backup and does nothing else, so that it gets as close to 'real time' as possible.

slave_net_timeout: in seconds, default 3600 seconds

Meaning: how long the slave waits without receiving log data from the master before it re-establishes the connection and fetches the data again.

master-connect-retry: in seconds, default 60 seconds

Meaning: when the master-slave connection is being re-established and the attempt fails, how long to wait before retrying.

Tuning these two parameters usually reduces the master-slave synchronization delay caused by network problems.

There are usually two methods for measuring master-slave delay:

1. Seconds_Behind_Master and 2. mk-heartbeat. The following explains how the two differ in what they actually measure.

Method 1. Seconds_Behind_Master

You can tell whether master-slave delay is occurring by monitoring the Seconds_Behind_Master value in the output of the SHOW SLAVE STATUS\G command. It takes the following values:
NULL - Either the I/O thread or the SQL thread has failed, that is, the thread's Running state is No rather than Yes.
0 - The value we most want to see. It indicates that replication is healthy, and you can assume there is no lag.
Positive value - The slave is already behind the master. The larger the number, the further behind it is.
Negative value - Very rare. I have only heard veteran DBAs mention it. It is actually a bug: the parameter does not support negative values, so this should never appear.
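The four cases above map naturally onto a small check in a monitoring script. A minimal sketch, assuming the value has already been read from the status output and that NULL arrives as Python's `None`. The function name and the returned messages are illustrative.

```python
def describe_seconds_behind_master(value):
    """Translate a Seconds_Behind_Master reading into a short status message."""
    if value is None:
        # NULL: the I/O thread or the SQL thread is not running.
        return "replication stopped"
    if value < 0:
        # Should never happen; treat it as a bad reading rather than real lag.
        return "unexpected negative value"
    if value == 0:
        return "no lag reported"
    return f"slave is {value}s behind"


for reading in (None, 0, 42, -3):
    print(reading, "->", describe_seconds_behind_master(reading))
```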

Seconds_Behind_Master is the difference between the timestamp (abbreviated TS) of the event the SQL thread is executing and the timestamp of the event the I/O thread has most recently copied. The relay log has the same content as the master's binlog, and every recorded SQL statement carries the TS at which it ran, so the reference value comes from the binlog. For this reason there is no need to synchronize with NTP, that is, no need to keep the master and slave clocks consistent.

Notice that the comparison really happens between the I/O thread and the SQL thread, and only the I/O thread is actually in contact with the master. This is where the problem shows up: when the master's I/O load is heavy or the network is congested, the I/O thread cannot copy the binlog in time (it is not interrupted, it is still copying), while the SQL thread keeps pace with the I/O thread. Seconds_Behind_Master is then 0, which we read as no delay, but in fact there is a delay. This is why using this parameter to monitor replication delay is often criticized as inaccurate. The value is not always inaccurate, though: if the network between the I/O thread and the master is good, it is quite useful. (It is like the relationship between a mother, her son, and her daughter-in-law: the mother and son are close, the daughter-in-law and son are close, but that does not mean the daughter-in-law and the mother are close. Just a joke :-)

Earlier we mentioned that Seconds_Behind_Master can be negative. We now know that the value is the difference between the latest TS fetched by the I/O thread and the TS the SQL thread has executed up to, and the former is always at least as large as the latter. The only possibility is that the TS of some event is wrong, smaller than that of an earlier event. When that happens, a negative value can appear.
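The blind spot can be shown with a few numbers. This sketch follows the simplified model given in the text (I/O thread TS minus SQL thread TS) rather than MySQL's exact internal computation, and all timestamps are made up.

```python
def seconds_behind_master(io_thread_ts, sql_thread_ts):
    """Simplified model from the text: latest fetched TS minus latest applied TS."""
    return io_thread_ts - sql_thread_ts


master_latest_ts = 1_000_120  # newest event written on the master
io_thread_ts = 1_000_000      # newest event the I/O thread managed to fetch
sql_thread_ts = 1_000_000     # the SQL thread has applied everything fetched

reported = seconds_behind_master(io_thread_ts, sql_thread_ts)
actual = master_latest_ts - sql_thread_ts
print(reported, actual)  # 0 120
```

The slave reports 0 because it compares only what it has already fetched, while the master is in fact 120 seconds ahead.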

Method 2. mk-heartbeat, a tool in the Maatkit toolkit, which is regarded as a way to measure replication delay accurately.

mk-heartbeat also relies on timestamps. It first requires that the master and slave clocks be consistent, which is done by synchronizing both with the same NTP server. It needs a heartbeat table on the master with at least two fields, id and ts, where id is the server_id and ts is the current timestamp, now(). The table structure is replicated to the slave. Once the table exists, a background process on the master runs a single-row update command that refreshes the row periodically, every 1 second by default. The slave also runs a monitoring command in the background, at the same interval as the master, which compares the TS value in the replicated row with the same TS value on the master. A difference of 0 means no delay, and the larger the difference, the more seconds of delay. Because replication is asynchronous, the timestamps are never perfectly identical, so the tool allows a half-second gap, and differences within that gap are treated as no delay. The tool measures delay through real replication and cleverly borrows timestamps to check it. Well done!
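The comparison step can be sketched as follows. This is not mk-heartbeat's own code: it assumes the monitor compares the replicated `ts` against the current time on a clock synchronized with the master, and the function name and `tolerance` parameter are hypothetical.

```python
def heartbeat_delay(replicated_ts, now, tolerance=0.5):
    """Return replication delay in seconds, treating small gaps as no delay.

    replicated_ts: ts value read from the heartbeat row on the slave
    now: current time on a clock synchronized with the master (e.g. via NTP)
    tolerance: gap, in seconds, that is ignored because replication is asynchronous
    """
    delay = now - replicated_ts
    return 0.0 if delay <= tolerance else delay


print(heartbeat_delay(100.0, 100.3))  # within the half-second gap
print(heartbeat_delay(100.0, 103.0))  # three seconds behind
```

Because the heartbeat row travels through the whole replication path, a stalled I/O thread shows up as a growing delay here instead of being hidden.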

Solution One:

http://blog.csdn.net/allen_hdh/article/details/19854783
