Anyone running MySQL master-slave replication will run into synchronization problems sooner or later. Most of them are caused by machine downtime, restarts, or primary key conflicts that stop the slave from working. This article collects such problems and their solutions, for reference only.
1. If the network between master and slave is interrupted, or the master or the slave is restarted, the slave automatically tries to reconnect to the master at the interval set in the configuration file (1 minute by default). Once the network and services are back to normal, it resumes synchronization from the position it had reached, with no manual intervention needed.
2. When master and slave are out of sync because of human error, you can resynchronize with the following commands:
    LOAD DATA FROM MASTER;
    LOAD TABLE tblname FROM MASTER;
Note that these commands lock the master database (they also exist only in old MySQL versions; both were removed in MySQL 5.5). If the database is very large, it is better to do this during downtime, or to lock the master briefly, record the output of SHOW MASTER STATUS;, and then synchronize by copying the database.
3. When an SQL-level error inside the binlog prevents master and slave from synchronizing, you can skip the offending statement and continue syncing as follows:
    STOP SLAVE;
    SET GLOBAL sql_slave_skip_counter=1;
    START SLAVE;
4. When you run SET GLOBAL sql_slave_skip_counter=1; you may get this error:
ERROR 1858 (HY000): sql_slave_skip_counter can not be set when the server is running with GTID_MODE = ON. Instead, for each transaction which you want to skip, generate an empty transaction with the same GTID as the transaction
The reason is clear: this method is not supported when the database is running in GTID mode.
So what can be done?
Here is how to skip a replication error in GTID mode. Execute the following steps in order.
First determine the GTID position, that is, the point recorded when the synchronization error occurred. Log in to MySQL and run:
    mysql> SHOW SLAVE STATUS\G
Look through the output and record this value:
Executed_Gtid_Set: 7f8d9eb8-a7fe-11e2-84fd-0015177c251e:1-260
Next, reset the master and slave state on the slave.
Note:
(This procedure is for the slave in a master-slave setup; with master-master replication it would be more cumbersome.)
(RESET MASTER on this slave also resets any slaves that replicate from it. Its main purpose here is to empty gtid_executed. A plain CHANGE MASTER TO is not enough: it appears to succeed without error, but the slave is in fact not updated, because the server takes its position from the Executed_Gtid_Set value shown by SHOW SLAVE STATUS\G.)
    mysql> RESET MASTER;
    Query OK, 0 rows affected (0.20 sec)
    mysql> STOP SLAVE;
    Query OK, 0 rows affected (0.05 sec)
    mysql> RESET SLAVE;
    Query OK, 0 rows affected (0.42 sec)
Now we need to set the GTID position so that the failed transaction is skipped. Remember the Executed_Gtid_Set we recorded in the first step? That is where replication went wrong, so to be conservative skip just that one transaction by adding 1 to the last ID:
    mysql> SET GLOBAL gtid_purged='7f8d9eb8-a7fe-11e2-84fd-0015177c251e:1-261';
    Query OK, 0 rows affected (0.18 sec)
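The "add 1 to the last ID" step is easy to get wrong by hand, so here is a minimal Python sketch of it. The function name bump_gtid_set is hypothetical, and the sketch assumes the recorded Executed_Gtid_Set contains a single server UUID, as in the example above.

```python
def bump_gtid_set(gtid_set: str) -> str:
    """Extend the last interval of a single-UUID GTID set by one transaction.

    Hypothetical helper: it only does the string arithmetic and does not
    talk to MySQL. Sets with several UUIDs (comma separated) are rejected.
    """
    if "," in gtid_set:
        raise ValueError("this sketch handles a single server UUID only")
    parts = gtid_set.strip().split(":")
    if len(parts) < 2:
        raise ValueError("expected 'uuid:interval'")
    # The last part is either 'start-end' or a single transaction number.
    first, _, last = parts[-1].partition("-")
    end = int(last or first)
    parts[-1] = f"{first}-{end + 1}"
    return ":".join(parts)


recorded = "7f8d9eb8-a7fe-11e2-84fd-0015177c251e:1-260"
print(f"SET GLOBAL gtid_purged='{bump_gtid_set(recorded)}';")
```

The printed statement is the one to paste into the mysql client on the slave after RESET MASTER.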
Since we have just reset master and slave, we need to change master again:
    CHANGE MASTER TO MASTER_HOST='192.168.1.136', MASTER_PORT=3306, MASTER_USER='dbadmin', MASTER_PASSWORD='123456', MASTER_AUTO_POSITION=1;
Then restart the slave:
    START SLAVE;
    SHOW SLAVE STATUS\G
Is the problem solved? What, another error? Look carefully: is it different from the previous one? If so, that proves the previous error was skipped. Just keep repeating the steps above until every error has been skipped. Don't find it tedious; after all, the data is important!
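The ERROR 1858 message quoted earlier suggests a different route: generate an empty transaction with the same GTID as the one to be skipped. The Python sketch below only builds the statement sequence commonly used for that; the helper name is hypothetical, and the GTID in the usage line assumes the failed transaction is number 261 from the same server UUID as in the example. Check the statements against the manual for your MySQL version before running them on the slave.

```python
import re
from typing import List

# One GTID looks like '<server uuid>:<transaction number>'.
GTID_RE = re.compile(r"^[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}:\d+$")


def empty_transaction_statements(gtid: str) -> List[str]:
    """Return the statements that commit an empty transaction under `gtid`.

    Hypothetical helper for illustration: it only builds strings and does
    not connect to MySQL.
    """
    if not GTID_RE.match(gtid):
        raise ValueError(f"not a single GTID: {gtid!r}")
    return [
        "STOP SLAVE;",
        f"SET GTID_NEXT='{gtid}';",
        "BEGIN;",
        "COMMIT;",
        "SET GTID_NEXT='AUTOMATIC';",
        "START SLAVE;",
    ]


for stmt in empty_transaction_statements("7f8d9eb8-a7fe-11e2-84fd-0015177c251e:261"):
    print(stmt)
```

Unlike the reset procedure above, this approach leaves gtid_executed on the slave intact, which may matter if other servers replicate from it.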
Replication synchronization error
This afternoon I set up MySQL replication with one master and three slaves. After all the servers were configured, the slaves reported the following error:
Last_IO_Error: Fatal error: The slave I/O thread stops because master and slave have equal MySQL server ids; these ids must be different for replication to work (or the --replicate-same-server-id option must be used on slave but this does not always make sense; please check the manual before using it).
The message means that the slave's server_id is the same as the master's. On checking, I found that I had not commented out the server_id=1 line in the slave's /etc/my.cnf (I had set server_id further down, in the replication section). I commented that line out and restarted MySQL, but the same error appeared.
Use the following command to view the server_id:
    mysql> SHOW VARIABLES LIKE 'server_id';
    +---------------+-------+
    | Variable_name | Value |
    +---------------+-------+
    | server_id     | 1     |
    +---------------+-------+
    1 row in set (0.00 sec)
MySQL had not picked up the new server_id from my.cnf, so the only option was to change it manually:
    mysql> SET GLOBAL server_id=2;  # the value here must match the one in my.cnf
    mysql> START SLAVE;
After this, the slave returned to normal.
But later I restarted the MySQL service with /etc/init.d/mysqld restart and checked the slave status again: the error above was back, and server_id had reverted to 1.
I looked at /etc/my.cnf once more and confirmed that this file was not the problem. A Google search showed that at startup MySQL looks for /etc/my.cnf, datadir/my.cnf, and user_home/my.cnf.
So I ran:
    find / -name "my.cnf"
Surprisingly, there was a my.cnf file in the /usr/local/mysql directory. I deleted that file and restarted the MySQL service, and everything went back to normal.
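A stray option file like this can be spotted more quickly by listing every uncommented server_id setting in the candidate files. The following Python sketch does that; the function name find_server_ids is hypothetical, and the two paths in the usage line are simply the ones that mattered in the story above.

```python
import re
from pathlib import Path

# Matches 'server_id = N' or 'server-id=N' when the line is not commented out.
SERVER_ID_RE = re.compile(r"^\s*server[-_]id\s*=\s*(\d+)", re.IGNORECASE)


def find_server_ids(paths):
    """Return (path, line number, value) for every active server_id setting.

    Paths that do not exist are silently skipped.
    """
    found = []
    for path in map(Path, paths):
        if not path.is_file():
            continue
        lines = path.read_text(errors="replace").splitlines()
        for lineno, line in enumerate(lines, 1):
            match = SERVER_ID_RE.match(line)
            if match:
                found.append((str(path), lineno, int(match.group(1))))
    return found


for hit in find_server_ids(["/etc/my.cnf", "/usr/local/mysql/my.cnf"]):
    print(hit)
```

If the same setting shows up with different values in more than one file, that is a hint that the server may be reading a value other than the one you edited.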
Some error handling and routine maintenance
To check replication status, you generally run the SHOW SLAVE STATUS command on the slave:
    mysql> SHOW SLAVE STATUS\G
    *************************** 1. row ***************************
    Slave_IO_State: Waiting for master to send event
    Master_Host: 192.168.0.100
    Master_User: root
    Master_Port: 3306
    Connect_Retry: 3
    Master_Log_File: mysql-bin.003
    Read_Master_Log_Pos: 79
    Relay_Log_File: mysql-relay-bin.003
    Relay_Log_Pos: 548
    Relay_Master_Log_File: mysql-bin.003
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
    Replicate_Do_DB:
    Replicate_Ignore_DB:
    Last_Errno: 0
    ...
In the output above, the fields we care about most are Slave_IO_Running and Slave_SQL_Running.
Slave_IO_Running: this thread reads the binlog from the master and writes it to the relay log on the slave.
Slave_SQL_Running: this thread reads the relay log on the slave and executes its events as SQL.
If either of them shows No, replication has stopped; the reason can be seen in Last_Errno.
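This check is easy to automate. The Python sketch below parses the vertical (\G) output of SHOW SLAVE STATUS into a dictionary and tests the two thread flags. The function names are hypothetical, and the sample text is made up for illustration (including the Last_Errno value), not taken from a real server.

```python
def parse_slave_status(text):
    """Turn 'Key: value' lines from SHOW SLAVE STATUS\\G into a dict."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # the '*** 1. row ***' separator has no colon and is skipped
            status[key.strip()] = value.strip()
    return status


def replication_healthy(status):
    """Both the I/O thread and the SQL thread must report Yes."""
    return (status.get("Slave_IO_Running") == "Yes"
            and status.get("Slave_SQL_Running") == "Yes")


# Illustrative sample only.
SAMPLE = """\
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.0.100
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Errno: 1062
"""

status = parse_slave_status(SAMPLE)
if not replication_healthy(status):
    print("replication stopped, Last_Errno =", status.get("Last_Errno"))
```

In practice the text could come from running mysql -e "SHOW SLAVE STATUS\G" on the slave, for example from a cron job.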
Sometimes the master is updated so frequently that the slave falls behind. The causes vary, for example a poor network layout or weak hardware, and the gap between master and slave keeps growing until some applications are affected. In that case the master and slave need to be brought back in sync periodically, using the following steps.
On the master:
    mysql> FLUSH TABLES WITH READ LOCK;
    Query OK, 0 rows affected (0.03 sec)
    mysql> SHOW MASTER STATUS\G
    *************************** 1. row ***************************
    File: mysql-bin.000004
    Position: 102
    Binlog_Do_DB:
    Binlog_Ignore_DB:
    1 row in set (0.00 sec)
Record the log file name and offset; these are the coordinates the slave has to replicate up to.
On the slave, use the MASTER_POS_WAIT() function to wait until it reaches those coordinates:
    mysql> SELECT MASTER_POS_WAIT('mysql-bin.000004', '102');
    +--------------------------------------------+
    | MASTER_POS_WAIT('mysql-bin.000004', '102') |
    +--------------------------------------------+
    |                                          0 |
    +--------------------------------------------+
    1 row in set (0.00 sec)
This SELECT blocks until the slave has reached the specified log file and offset. It returns the number of log events it had to wait for, so 0 means the slave had already reached that position and is in sync with the master; -1 means the wait timed out.
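As a small illustration of how a script might use this, the Python sketch below builds the query (optionally with the function's third argument, a timeout in seconds) and interprets the value that comes back. Both helper names are hypothetical. The NULL case is not covered in the text above; the sketch assumes, per the MySQL manual, that NULL is returned when the slave SQL thread is not running or the arguments are invalid.

```python
def master_pos_wait_sql(log_file, log_pos, timeout_s=None):
    """Build the SELECT; sketch only, log_file is not escaped."""
    args = f"'{log_file}', {int(log_pos)}"
    if timeout_s is not None:
        args += f", {int(timeout_s)}"
    return f"SELECT MASTER_POS_WAIT({args});"


def interpret_master_pos_wait(result):
    """Map the function's return value (None stands for SQL NULL) to a verdict."""
    if result is None:
        return "not waiting: SQL thread stopped or bad arguments"
    if result == -1:
        return "timed out"
    return "caught up"


print(master_pos_wait_sql("mysql-bin.000004", 102, timeout_s=30))
print(interpret_master_pos_wait(0))
```

Passing a timeout avoids blocking the session indefinitely if the slave never reaches the target position.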
If the slave fails to apply an update, first determine whether the table on the slave differs from the one on the master. If the failure is caused by a difference in table structure, modify the table on the slave so that it matches the master, and then run START SLAVE again.
If the failure is not caused by a table structure difference, confirm that a manual update is safe and then skip the failed statement from the master with SET GLOBAL sql_slave_skip_counter=n. Use n=1 if the statement from the master does not use AUTO_INCREMENT or LAST_INSERT_ID(), and n=2 if it does, because statements that use AUTO_INCREMENT or LAST_INSERT_ID() take up two events in the binary log.
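The rule for choosing n can be written down as a tiny Python sketch. The function name is hypothetical, and the n=1 versus n=2 rule is taken from the paragraph above as stated; how the counter counts events can differ between MySQL versions, so treat it as an assumption to verify for your server.

```python
def skip_statements(uses_auto_increment_or_last_insert_id: bool):
    """Build the skip sequence, following the n=1 / n=2 rule described above."""
    n = 2 if uses_auto_increment_or_last_insert_id else 1
    return [
        "STOP SLAVE;",
        f"SET GLOBAL sql_slave_skip_counter={n};",
        "START SLAVE;",
    ]


# Failed statement was a plain UPDATE without AUTO_INCREMENT or LAST_INSERT_ID():
print("\n".join(skip_statements(False)))
```

As noted earlier, this counter cannot be set at all when the server runs with GTID mode on.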