A note point for manual switching of dual-master +haproxy

Source: Internet
Author: User
Tags mysql client haproxy
switch logic that was previously designed

1 query Slve delay, more than n seconds delay wait or return failure to ensure that the business impact time is the shortest

2 Login proxy node, disable current hproxy, make subsequent business connections through proxy fail

3 Login proxy node, shutdown currently connected through proxy session (if SQL can be quickly completed, this step can be used to do not do, you could do a time threshold detection, when n seconds after the business layer connection is Kill)

4 Record the Binlog and POS points of the current main library

5 waiting for the data to be identical between the two-master, that is, after the POS point recorded from the library to step 4 can be implemented using the Master_pos_wait (Log_name,log_pos[,timeout]) built-in function

6 Perform a switch, that is, an alternate node in the Enable M-m

7 Return switch Successful


Failure Phenomenon

After switching, show slave status appears a large number of primary key conflict error


cause Analysis

Through the analysis, found that through the Haproxy to kill the session is not really reliable, when the connection is established, even if through Haproxy kill the session or kill the MySQL client in the process, actually no effect, will continue to execute successfully, If the main library inserts a 10 self ID that is not complete

When the new main library is connected, the slow SQL has not run out, that is, the record with ID 10 is not synchronized.

The new main library inserts a record again, and the resulting ID is still 10.

Slow SQL at this time in the main library execution completed, upload to the new main database key conflict, error scene simulation

1 login MySQL via proxy, execute the following command

Mysql> select * from Test_ha_switch;
Empty Set (0.00 sec)


mysql> INSERT INTO Test_ha_switch (DD) VALUES (sleep (100));

SQL is in operation at this time

2 login proxy, kill session by Socat

echo "Shutdown Sessions Server ha-proxy/10.9.188.208" | Socat Stdio/opt/udb/instance/haproxy/7378229c-7ada-4f90-b1ec-96d704979426/stats

3 At this point go back to the session in step 1 and find that it is already in error 2013 (HY000): Lost connection to MySQL server during query

mysql> INSERT INTO Test_ha_switch (DD) VALUES (sleep (100));
ERROR 2013 (HY000): Lost connection to MySQL server during query

4 Query the table again, found that the record is still written in the

Mysql> select * from Test_ha_switch;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect ...
Connection id:1536352
Current Database:jiang
+----+------+
| ID | DD |
+----+------+
|    1 | 0 |
+----+------+
1 row in Set (0.01 sec)


In addition, if there is no proxy, it is also well simulated by killing the process where the client is located

# mysql-s/opt/udb/instance/mysql-5.6/5770e236-fef6-4d61-bc61-beb6d69f7dbe/mysqld.sock-uucloudbackup jiang-e ' show Processlist; "
+---------+--------------+-------------------+-------+---------+------+-------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+---------+--------------+-------------------+-------+---------+------+-------+------------------+
| 1536517 | Ucloudbackup | 10.9.125.20:62374 | Jiang |    Sleep |       7 | | NULL |
| 1536566 | Ucloudbackup | localhost | Jiang |    Query | 0 | init | Show Processlist |
+---------+--------------+-------------------+-------+---------+------+-------+------------------+
# nohup Mysql-s/opt/udb/instance/mysql-5.6/5770e236-fef6-4d61-bc61-beb6d69f7dbe/mysqld.sock-uucloudbackup jiang-e "INSERT into Test_ha_switch (DD) select DD from Test_ha_switch;" &
[1] 28752
# nohup:ignoring input and appending output to ' nohup.out '

# Ps-ef|grep Test_ha_switch
Root 28752 17668 0 18:10 pts/0 00:00:00 mysql-s/opt/udb/instance/mysql-5.6/5770e236-fef6-4d61-bc61-beb6d69f7dbe/ mysqld.sock-uucloudbackup-px xxxxxxxx jiang-e INSERT INTO Test_ha_switch (DD) select DD from Test_ha_switch;
Root 29117 17668 0 18:10 pts/0 00:00:00 grep test_ha_switch
# kill-9 28752
[1]+ killed Nohup mysql-s/opt/udb/instance/mysql-5.6/5770e236-fef6-4d61-bc61-beb6d69f7dbe/mysqld.sock -uucloudbackup jiang-e "INSERT into Test_ha_switch (DD) select DD from Test_ha_switch;"
# mysql-s/OPT/UDB/INSTANCE/MYSQL-5.6/5770E236-FEF6-4D61-BC61-BEB6D69F7DBE/MYSQLD.SOCK-UUCLOUDBACKUP-PJFKS9FFXVK Jiang-e "Show processlist;"
+---------+--------------+-------------------+-------+---------+------+--------------+------------------------- -------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+---------+--------------+-------------------+-------+---------+------+--------------+------------------------- -------------------------------------+
| 1536517 | Ucloudbackup | 10.9.125.20:62374 | Jiang |   Sleep |              29 | | NULL |
| 1536570 | Ucloudbackup | localhost | Jiang |   Query | 16 | Sending Data | INSERT into Test_ha_switch (DD) SELECT DD from Test_ha_switch |
| 1536575 | Ucloudbackup | localhost | Jiang |    Query | 0 | init | Show Processlist |
+---------+--------------+-------------------+-------+---------+------+--------------+------------------------- -------------------------------------+


Optimization Method

The optimization method is also relatively simple, that is, in the switching logic between 3 and 4, by direct access to the main node of the way (at this time has been unable to login DB by proxy), if there is a business connection, then kill the current business connection through the MySQL kill syntax;

It is worth mentioning that kill the thread to ensure that the normal exit, this may be the case, there are two connections, when you kill the first connection has been disconnected, then you go to kill the thread may be an error exit, so that the second thread actually did not kill

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.