Solving a network problem caused by a repaired server frantically pulling the master's binlog after restart

Source: Internet
Author: User

Problem description:

A week ago, a MySQL server went down because of a hardware fault. We filed a request with the colleague responsible for this area, who in turn passed it on to the colleague responsible for repairing the server. Today, once the server had been repaired, they powered it on. The four MySQL instances on the server started automatically at boot and began pulling binlogs from their master databases. Because the server had been down for a long time, the instances had a large backlog of binlog to catch up on, and pulling it overloaded the master database's network.
Symptom:
At first we did not realize that the problem was caused by the repaired server restarting and pulling binlogs from the master database. All we knew was that a server had been sent for repair a week earlier; we had no idea whether it had been fixed or whether it had been powered on. Then the network staff told us that the network traffic of one MySQL machine was abnormally high, making the business very slow. The impact lasted 17 minutes in total. At that point we had no clue as to the cause.
Troubleshooting:
No problems were found in the processlist, the full logs, or the slow logs.
Monitoring showed that the affected server's read I/O had suddenly increased during that period. Looking through the processlist history, we found that for a while the threads of the master-slave replication user were in the "waiting for net" state, and their client IP address turned out to belong to the server that had crashed a week earlier.
Conclusion: There are four instances on this server. When the server booted, the MySQL instances started automatically and began pulling binlogs from their master databases. Each master database produces about 6 GB of binlog per day, so after a week the four instances together had a backlog of roughly 160 GB (6 GB × 4 instances × 7 days ≈ 168 GB).
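The backlog figure can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above (6 GB per day, four instances, seven days). The 60-minute window passed to `required_mbit_per_s` is a hypothetical value chosen for illustration, not something measured during the incident.

```python
# Rough backlog estimate using the figures quoted in the conclusion.
GB_PER_DAY_PER_MASTER = 6   # approximate daily binlog volume per master
INSTANCES = 4               # MySQL instances on the repaired server
DAYS_DOWN = 7               # the server was down for about a week


def backlog_gb(gb_per_day, instances, days):
    """Total binlog volume the instances must fetch after the outage."""
    return gb_per_day * instances * days


def required_mbit_per_s(total_gb, window_minutes):
    """Average rate needed to move total_gb within window_minutes.

    Uses decimal units (1 GB = 1000 MB) to keep the arithmetic simple.
    """
    megabits = total_gb * 1000 * 8
    return megabits / (window_minutes * 60)


total = backlog_gb(GB_PER_DAY_PER_MASTER, INSTANCES, DAYS_DOWN)
print(f"backlog: {total} GB")

# Hypothetical: how fast would the transfer have to be to finish in an hour?
print(f"rate for a 60-minute catch-up: {required_mbit_per_s(total, 60):.0f} Mbit/s")
```

Even when spread over a full hour, a backlog of this size needs several hundred Mbit/s on average, which helps explain why unthrottled catch-up traffic was visible to the network staff.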
Problem:
1. We can neither control nor know when a faulty server will be repaired or powered on, and we were not paying attention to it.
2. This is a simple, typical scenario that can cause an impact or an outage. Although the risk is easy to understand, we had not anticipated it in our own environment, so the incident happened.
3. There is no effective monitoring of network traffic.
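For the missing network traffic monitoring, one minimal approach on Linux is to sample the byte counters in `/proc/net/dev` twice and compute the rate in between. The sketch below is an illustration only: the interface name `eth0`, the sample text, and the `ALERT_MBIT_PER_S` threshold are assumptions, and a production setup would normally rely on an existing monitoring system rather than a hand-written script.

```python
# Minimal sketch: derive outbound throughput from two /proc/net/dev samples.
# Assumes the Linux /proc/net/dev layout: two header lines, then one line
# per interface with 8 receive counters followed by 8 transmit counters.

ALERT_MBIT_PER_S = 800  # hypothetical alert threshold, tune per link speed

SAMPLE = """Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo: 1000 10 0 0 0 0 0 0 1000 10 0 0 0 0 0 0
  eth0: 5000 50 0 0 0 0 0 0 2000000 40 0 0 0 0 0 0
"""


def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev text."""
    counters = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def tx_mbit_per_s(before, after, iface, seconds):
    """Average transmit rate of iface between two samples, in Mbit/s."""
    sent_bytes = after[iface][1] - before[iface][1]
    return sent_bytes * 8 / 1_000_000 / seconds


def should_alert(rate_mbit_per_s, threshold=ALERT_MBIT_PER_S):
    """True when the measured rate reaches the alert threshold."""
    return rate_mbit_per_s >= threshold


print(parse_net_dev(SAMPLE))
```

In real use, the two samples would come from reading `/proc/net/dev` a few seconds apart on the master, and an alert on sustained high transmit rates would have pointed at the traffic spike much earlier than the processlist history did.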
Solution:
1. Disable automatic MySQL startup on all servers. After a server boots, start the instances manually and stop the slave. (With a large number of servers this may be too much trouble; for now, start by recording the servers where the impact would be greatest.)
2. Stay aware of this problem: add it to the knowledge base or work manual so that it does not happen again.
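A related safeguard, not part of the original write-up, is MySQL's `skip-slave-start` option (named `skip-replica-start` in newer 8.0 releases), which lets the instance start without starting its replication threads. The sketch below is a hypothetical audit helper that checks whether a `my.cnf` text sets that option under `[mysqld]`. It uses Python's `configparser`, which handles simple option files like the sample but does not understand MySQL's `!include` directives, so treat it as an illustration rather than a complete parser.

```python
import configparser

# Option names that keep replication threads from starting with the server.
# MySQL treats dashes and underscores in option names as interchangeable.
REPLICATION_GUARDS = {"skip_slave_start", "skip_replica_start"}
TRUTHY = {"1", "on", "true"}


def replication_autostart_disabled(cnf_text):
    """Return True if [mysqld] in cnf_text disables replication autostart."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # bare options such as "skip-slave-start"
        strict=False,          # tolerate repeated options
        interpolation=None,    # do not treat "%" in values specially
    )
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return False
    for key, value in parser.items("mysqld"):
        normalized = key.replace("-", "_")
        if normalized in REPLICATION_GUARDS:
            # A bare option (no value) counts as enabled.
            if value is None or value.strip().lower() in TRUTHY:
                return True
    return False


# Hypothetical option-file contents used only for demonstration.
guarded = "[mysqld]\nport = 3306\nskip-slave-start\n"
unguarded = "[mysqld]\nport = 3306\n"

print(replication_autostart_disabled(guarded))
print(replication_autostart_disabled(unguarded))
```

With such an option in place, an instance that comes back after a long outage stays idle as a replica until someone starts replication by hand, which matches the intent of the first solution item without requiring every instance to be started manually.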
