Problem grooming in MySQL master-slave synchronization

Source: Internet
Author: User
Tags mysql version

Prior to the deployment of the MySQL master-slave replication environment (MySQL replication environment (master/Slave) deployment summary grooming), during the MySQL synchronization process, there are many problems, resulting in data synchronization anomalies.
The following are some of the problems that may exist in the master-slave synchronization:
1) Slave run too slow can not synchronize with master, that is, MySQL database master-slave synchronization delay
MySQL database slave Server latency is very common, MySQL replication allows a select operation from the machine, but in the actual online environment, because of the slave delay relationship, it is difficult to turn the read operation to the slave. This leads to the following unspoken rules:"read operations with low real-time requirements can be placed on the slave server, with high real-time read operations placed on the master server", "from the machine can only do the day before the statistical class query."
Slave lag that slave cannot quickly execute all events from master, thus preventing updates to slave data latency.
In MySQL's master-slave architecture, master only writes, updates, deletes, and slave does a select operation. There are many reasons for slave lag.
slave The principle of synchronous delay
MySQL's master-slave replication is a single-threaded operation, and the main library writes into Binlog for all the DDL and DML-generated logs, and is highly efficient because binlog is sequential.
The slave IO thread thread reads the log from the bin log in the main library.
The slave SQL thread thread replays the DDL and DML operation events of the main library in slave. The IO operations of DML and DDL are immediate, not sequential, and much more expensive.
Because SQL thread is also single-threaded, if other queries on slave produce lock contention, or if a DML statement (large transaction, large query) executes for a few minutes, all subsequent DML waits for the DML to execute before it continues execution, which results in a delay.
slave possible causes of synchronization delays
The 1--slave I/O thread defers reading the event information in the log; the most common reason is that slave executes all transactions in a single thread, while master has many threads that can execute transactions in parallel.
2--provides long queries for inefficient connections, disk read I/O limits, lock contention, and InnoDB thread synchronization startup.
3--master load; Slave load
4--Network Latency
5--machine configuration (CPU, memory, hard disk)
In summary, when the main library is high in concurrency, the number of DML generated exceeds the speed that slave's SQL thread can handle, or the delay occurs when a large query statement in slave generates a lock wait.
How to view synchronization delays
Can pass the log location on the master, slave
2--the value of Seconds_behind_master by "show slave Status", which represents the time of the master-slave synchronization delay, the larger the value, the more severe the delay. A value of 0 is normal, and positive values indicate that there has been a delay, and the larger the number, the more the main library falls behind the library.
3--is viewed using Percona-toolkit's pt-hearbeat tool.
action plan to reduce synchronization delay
To reduce lock competition
If the query results in a large number of table locks, consider refactoring the query statement and try to avoid too many locks.
2--Load Balancing
Building up how many slave, and using LVS or Nginx for query load balancing, can reduce the number and time of each slave execution query, thus more time is spent processing master-slave synchronization.
3--salve Higher machine configuration
4--slave Adjusting Parameters
To ensure high data security, configure settings such as Sync_binlog=1,innodb_flush_log_at_trx_commit=1. While slave can be turned off binlog,innodb_flush_log_at_trx_commit can also be set to zero to improve SQL execution efficiency
5--Parallel Replication
That is, single-threaded replication is changed to multi-threaded replication.
There are two threads associated with replication from the library: Io_thread is responsible for taking binlog from the main library and writing to Relaylog, Sql_thread is responsible for reading Relaylog and executing.
The idea of multithreading is to turn Sql_thread into a distribution thread, which is then executed by a set of Worker_thread.
Almost all parallel replication is this way of thinking, and there are different, sql_thread distribution strategies.
MySQL5.7 's true parallel replication enhanced multi-threaded Slave (MTS) solves the latency problem of master-slave synchronous replication.

2) Slave_io_running:no appears in slave sync state
Error: Last_io_error:got fatal error 1236 from master If reading data from binary log: ' Could not ' find first log file name In binary log index file '
Cause: Cleaning up the data causes the master and slave libraries to be out of sync.
Resolution: Http://www.cnblogs.com/kevingrace/p/6256603.html

3) slave_io_running:connecting appears in slave sync state
The cause of this error is generally:
Network does not pass
2--Permissions issue (Connection Master's username and password are inconsistent with Master authorization)
Log file and pos node used for 3--connection are inconsistent with the results of "Show Master Status"

4) Slave synchronization status in the presence of Slave_sql_running:no, that is, slave different steps!
Workaround:
the first method: After ignoring the error, continue syncing.
This method is applicable to the case that the master-slave database data is not small, or the data can not be completely unified, the data requirements are not strict (below are the operations on the slave machine)
mysql> stop Slave;
mysql> Set Global sql_slave_skip_counter = 1; Indicates a skip step error, the following number is variable, or add slave-skip-errors = All (above has been added to the configuration) in MY.CNF
mysql> start slave;
Mysql> Show slave status\g View:
Second method: Re-master from, fully synchronized
This method is suitable for the case that the master-slave database data is large or the data is completely unified.
1--master operations on the main library
Mysql> flush tables with read lock; Locks the table to prevent data writes. Note that the position is locked as read-only, and the statement is case insensitive
# mysqldump-uroot-p-hlocalhost > Mysql.bak.sql//Backup data to Mysql.bak.sql file, note that database backups must be done regularly to ensure data is foolproof
Mysql> Show master status; Check the master status, note the log file and POS node, and the slave synchronization will use
# SCP Mysql.bak.sql [email protected]:/tmp///upload the backup file to slave from the library machine for data recovery
2--slave operation from Library
mysql> stop Slave;
Mysql> Source/tmp/mysql.bak.sql
mysql> Change Master to Master_host = ' 192.168.1.101 ', master_user = ' slave ', master_port=3306 ...;
mysql> start slave;
Mysql> Show Slave Status\g
.......
Slave_io_running:yes
Slave_sql_running:yes

5) Slave trunk log relay-log corrupted?
What is a trunk log?
Relay-log is stored on the slave server, copies the primary server's binary log files from the server to its own host in the trunk log, and then calls the SQL thread to follow the binary log files in the copy relay log file so that data synchronization can be achieved.
How to relay logs to avoid:
After MySQL version 5.6, relay_log_recover=1 can be avoided in the my.cnf file.

6) Slave connection timeout and reconnect frequently
If there are many slave, and there is no setting of server_id or two slave setting the same server_id, there is a possibility that the server ID conflict will occur. In this case, one of the slave may be frequently timed out or lost after reconnecting the sequence.
So make sure that each slave and master sets a different server_id in the my.cnf.

7) The main library is out of sync with a different storage engine from the library

8) When synchronizing from the library, the prompt table does not exist
Error: Last_error:error executing row event: ' Table ' test.t1 ' doesn ' t exist '
WORKAROUND: Rebuild the table from the library.

9) Max_allowed_packet set too small cause slave error
Max_allowed_packet default is 16M, the Max_allowed_packet value of the master and slave libraries and the mismatch on the standby.
In this case, the main library may record a package that is considered too large by the repository. When the repository obtains the binary log event, it may encounter various problems, such as infinite error and retry, trunk log corruption, etc.
Specific performance:
Slave_io_thread dead from the library, after viewing, the following error message appears:
Got a packet bigger than ' Max_allowed_packet ' bytes
Obviously due to the max_allowed_packet setting is too small, and then check the settings on the master-slave library, the main library is set larger than the library, because Max_allowed_packet is a dynamic parameter, first adjust the max_allowed_packet from the library As with the main library, restarting the I/O thread separately is normal.
Rationale: Binlog events are recorded in RBR format, and the current event length is greater than the max_allowed_packet from the library, resulting in the inability to slave io to read the master Binlog event correctly.

10) A failure occurred while deleting a record on Master
After deleting a record on master, the slave error is not found on this record.
Workaround:
You can skip this statement because it has been deleted on the main library.
In this case, it is explained that the master-slave synchronization may have inconsistent data, so it is necessary to use Pt-table-checksum for database consistency.
(Refer to: MySQL master-slave synchronization data Consistency check tool Pt-table-checksum usage grooming)

11) Update a record in master, but slave cannot find it.
Master has a record when the master data is not available, but there is no record on salve, if the record is updated on master, an error may be found in slave.
Workaround:
Take a look at the binary logs on the main library based on the location of the exception from the library.
2--based on the main repository binary log information, find the entire record after the update.
3--performs an insert operation by executing the record information found on the main library from the library.
4--skips this statement, and then synchronizes the slave.
5--use Pt-table-checksum to see whether the Master-slave library table data is consistent.

Problem grooming in MySQL master-slave synchronization

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.