Failover procedure after the db0101 master database in one of 48 online node groups went down (MySQL)

A phone call came in reporting that db0101 was down. The symptoms were:

(1) The application pages returned 500, 503, and 504 errors.

(2) An email alert arrived: db0201 is down now!

1. Initial diagnosis of the symptoms

Pinging 20.222.21.173 returned an unreachable error, so I called the system administrator and the hardware engineers and asked them to log on to the physical host and check the fault.

In the meantime, I logged on to the mmm monitor server to check the cluster status:

[nova@db0203 ~]$ sudo -u mmmd mmm_control show

# Warning: agent on host db1 is not reachable

db1 (20.222.21.173) master/HARD_OFFLINE. Roles: reader (20.222.22.57), writer (20.222.22.56)

db2 (20.222.22.145) master/ONLINE. Roles: reader (20.222.22.58)

db1 is in the master/HARD_OFFLINE state, probably because of a hardware failure.
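The status check above can be turned into a small filter that lists only the hosts that are not ONLINE. This is a minimal sketch, not part of mmm itself: the function name not_online_hosts is hypothetical, and the line layout it expects is assumed from the mmm_control show output quoted in this post.

```shell
# Print "<host> <state>" for every host in `mmm_control show` output
# (read on stdin) whose state is anything other than ONLINE.
# not_online_hosts is a hypothetical helper name, not an mmm command.
not_online_hosts() {
    awk '
        /^[ \t]*#/ { next }                      # skip "# Warning: ..." lines
        match($0, /[a-z]+\/[A-Z_]+/) {           # e.g. master/HARD_OFFLINE
            state = substr($0, RSTART, RLENGTH)
            sub(/^[a-z]+\//, "", state)          # keep only the state part
            host = $1
            sub(/\(.*/, "", host)                # handles db1(...) and db1 (...)
            if (state != "ONLINE") print host, state
        }
    '
}

# Typical use on the monitor host (not run here):
#   sudo -u mmmd mmm_control show | not_online_hosts
```

Reading from stdin keeps the function testable against saved output, so it can be tried out without touching the live cluster.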

2. Emergency failover to restore the application

Because the application pages were returning errors and db0201 was down, a failover had to be performed immediately to switch to db0202 as quickly as possible. The switch was done manually, as follows.

[nova@db0203 ~]$ sudo -u mmmd /usr/sbin/mmm_control move_role writer db2

OK: Role 'writer' has been moved from 'db1' to 'db2'. Now you can wait some time and check new roles info!

[nova@db0203 ~]$ sudo -u mmmd mmm_control show

# Warning: agent on host db1 is not reachable

db1 (20.222.21.173) master/HARD_OFFLINE. Roles: reader (20.222.22.57)

db2 (20.222.22.145) master/ONLINE. Roles: reader (20.222.22.58), writer (20.222.22.56)

The switch succeeded. The writer role now points to db0202, and the application pages no longer report errors. Logging on to db0202 and executing show full processlist; displayed more than 500 client connections, which indicates that the application had switched to db0202.
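To see which application servers have reconnected, rather than reading 500 processlist rows by eye, the rows can be counted per client host. This is a rough sketch under assumptions: connections_by_client is a hypothetical helper name, and the input is assumed to be the tab-separated output of the mysql client in batch mode, where the third column is Host in host:port form.

```shell
# Count connections per client host from tab-separated
# SHOW FULL PROCESSLIST rows read on stdin
# (column order: Id, User, Host, db, Command, Time, State, Info).
# connections_by_client is a hypothetical helper name.
connections_by_client() {
    awk -F '\t' '
        $3 != "" {
            client = $3
            sub(/:[0-9]+$/, "", client)   # drop the client port
            count[client]++
        }
        END { for (c in count) print count[c], c }
    ' | sort -rn                          # busiest clients first
}

# Typical use on db0202 (not run here; credentials are assumed to come
# from an option file such as ~/.my.cnf):
#   mysql -N -B -e 'SHOW FULL PROCESSLIST' | connections_by_client
```

The -N and -B options make the mysql client omit the header row and print tab-separated rows, which is the format the awk script expects.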

3. Doubts about performing the failover

What should be done before a failover? Is it necessary to wait, or can the failover simply be executed right away? This was an online operation and I had no prior experience to draw on, so I executed the failover directly.

Execution time: 18:45

Command executed: sudo -u mmmd /usr/sbin/mmm_control move_role writer db2

About an hour later, the SA and the hardware engineer finished checking the physical host. It had run out of memory, and by default the virtual machine using the most memory, which was the MySQL one, was killed. They adjusted the parameter settings and protection measures (I do not understand the details well).

4. Set db1 online

After the db0201 server was started, replication had to be enabled manually by executing start slave; after that, replication resumed and data synchronized normally. Then I checked the mmm status again.

[nova@db0203 ~]$ sudo -u mmmd mmm_control show

db1 (20.222.21.173) master/AWAITING_RECOVERY. Roles: reader (20.222.22.57)

db2 (20.222.22.145) master/ONLINE. Roles: reader (20.222.22.58), writer (20.222.22.56)

Do not panic on seeing AWAITING_RECOVERY. db1 went down because of a hardware fault, so although mmm can monitor db1 again, it does not set db1 online by itself. We have to judge whether db1 is healthy and, if it is, set it online ourselves; this is one of the places where mmm is cautious. After checking db1 and finding that its replication was normal, I set db1 online.

Command executed: sudo -u mmmd mmm_control set_online db1

db1 (20.222.21.173) master/ONLINE. Roles: reader (20.222.22.57)

db1 is now online.
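The replication check done before set_online can be scripted. The sketch below is one possible check, not the one used during the incident: replication_healthy is a hypothetical helper name, and the lag threshold is an assumption. It relies only on the Slave_IO_Running, Slave_SQL_Running, and Seconds_Behind_Master fields that SHOW SLAVE STATUS reports.

```shell
# Succeed (exit 0) only if SHOW SLAVE STATUS\G output read on stdin shows
# both replication threads running and a lag of at most max_lag seconds.
# replication_healthy is a hypothetical helper; max_lag defaults to 0.
replication_healthy() {
    max_lag=${1:-0}
    awk -v max_lag="$max_lag" '
        $1 == "Slave_IO_Running:"      { io = $2 }
        $1 == "Slave_SQL_Running:"     { sql = $2 }
        $1 == "Seconds_Behind_Master:" { lag = $2 }
        END {
            # A lag of NULL (replication stopped) fails the numeric test.
            ok = (io == "Yes" && sql == "Yes" && lag ~ /^[0-9]+$/ && lag + 0 <= max_lag + 0)
            exit (ok ? 0 : 1)
        }
    '
}

# Typical use from the monitor host (not run here; assumes the name db1
# resolves and credentials come from an option file):
#   mysql -h db1 -e 'SHOW SLAVE STATUS\G' | replication_healthy 5 &&
#       sudo -u mmmd mmm_control set_online db1
```

Chaining set_online behind the check keeps the manual decision mmm asks for, while making the criterion explicit and repeatable.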

5. Move the writer role from db2 back to db1

Let db1 and db2 run as dual masters for a while and confirm that everything is normal. After about 20 minutes, the switchover can be performed; after all, db1 is on SSD and db2 is on ordinary disks.

[nova@db0203 ~]$ date

Thu Sep 5 12:11:02 GMT 2013

[nova@db0203 ~]$ sudo -u mmmd /usr/sbin/mmm_control move_role writer db1

OK: Role 'writer' has been moved from 'db2' to 'db1'. Now you can wait some time and check new roles info!

[nova@db0203 ~]$ sudo -u mmmd mmm_control show

db1 (20.222.21.173) master/ONLINE. Roles: reader (20.222.22.57), writer (20.222.22.56)

db2 (20.222.22.145) master/ONLINE. Roles: reader (20.222.22.58)

db1 is the writer again.
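The final verification can also be done mechanically by extracting which host holds the writer role. This is a small sketch with a hypothetical helper name, writer_host, and it assumes the Roles: line layout shown in the mmm_control show output in this post.

```shell
# Print the host that currently holds the writer role, given
# `mmm_control show` output on stdin. Prints nothing if no host holds it.
# writer_host is a hypothetical helper name, not an mmm command.
writer_host() {
    awk '
        /^[ \t]*#/ { next }            # skip warning lines
        /Roles:.*writer/ {
            host = $1
            sub(/\(.*/, "", host)      # handles db1(...) and db1 (...)
            print host
        }
    '
}

# Typical use on the monitor host (not run here):
#   sudo -u mmmd mmm_control show | writer_host
```

Comparing the printed host with the expected one (db1 here) gives a simple pass or fail after each move_role.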
