What should I do after the server load surge?

Last Update: 2013-12-18 Source: Internet

Author: User

I was preparing slides for a talk when my phone signaled a text message. I assumed it was an advertisement and ignored it, but the messages kept arriving. It did not take much thought to guess that a server was raising alarms.

When I opened the Nagios monitoring page, I found that three servers (all in the same cluster, serving a forum with about 40,000 users online) had excessive load and were in the WARNING state.

1. Check the access traffic first. It was no different from before.

2. Check the number of processes and the CPU usage of each server. These were also no different from before.

3. View the system logs. Each server showed "TCP: Treason uncloaked! Peer 113.247.241.146:21345/80 shrinks window 2128147967:2128149427. Repaired."

4. View the PHP logs. There were a large number of entries like "[WARNING] fpm_request_check_timed_out(), line 158: child 25379, script '/mnt/html/bbs/forum.php' (pool default) execution timed out (120.306361 sec), terminating". Opening the forum home page was taking more than 120 seconds. The execution timeout set in the PHP configuration file is 120 seconds; when a request exceeds it, the child process is terminated. This looked like the place to start.
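To see which scripts are hitting the timeout and how often, the php-fpm log can be summarized with a few lines of Python. This is a minimal sketch: the regular expression is modeled only on the warning quoted above, so other php-fpm versions may need a different pattern, and `count_timeouts` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern modeled on the php-fpm warning quoted in the article.
# Assumption: other php-fpm versions may format this line differently.
TIMEOUT_RE = re.compile(
    r"child (?P<pid>\d+), script '(?P<script>[^']+)' "
    r"\(pool (?P<pool>[^)]+)\) execution timed out "
    r"\((?P<seconds>[\d.]+) sec\)"
)


def count_timeouts(lines):
    """Return a Counter mapping script path -> number of timed-out requests."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group("script")] += 1
    return counts


# Sample input: the first line is the warning from the article,
# the second is an unrelated line that must be ignored.
SAMPLE_LOG = [
    "[WARNING] fpm_request_check_timed_out(), line 158: child 25379, "
    "script '/mnt/html/bbs/forum.php' (pool default) execution timed out "
    "(120.306361 sec), terminating",
    "[NOTICE] some unrelated log line",
]

if __name__ == "__main__":
    for script, hits in count_timeouts(SAMPLE_LOG).most_common():
        print(script, hits)
```

In practice the function would be fed the lines of the real log file (for example `count_timeouts(open(path))`, where `path` is wherever php-fpm writes its error log on your system). If one script dominates the result, as the forum home page did here, the problem is more likely in a shared back end than in the web servers.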

First I asked my colleagues whether the program had been changed recently or whether any plug-in had been added. The answer was "No". So I checked the system carefully:

1) Check whether the file system is damaged and cannot be written.

2) Check whether any partition is full. (If a partition were actually full, an SMS alarm would have been triggered.)

3) Check the TCP connection status. It did not look like a system problem.

Next came the associated services: the database, the NFS file system, and memcached. Start with the ones that are easy to check: NFS first, then memcached. Neither showed a problem, so it seemed that something was wrong with the database.
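The partition and TCP checks above can be sketched in Python as follows. This is an illustration under stated assumptions, not what the author ran: the helper names are hypothetical, the TCP summary reads text in the Linux `/proc/net/tcp` layout (connection state as a hex code in the fourth column), and the sample data is made up.

```python
import shutil
from collections import Counter

# Hex state codes used in the "st" column of Linux /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def partition_usage_percent(path):
    """Return the used share (0-100) of the file system that holds `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def count_tcp_states(proc_net_tcp_text):
    """Count connections per TCP state from /proc/net/tcp-style text."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) > 3:
            code = fields[3].upper()
            counts[TCP_STATES.get(code, code)] += 1
    return counts


# Made-up sample in the /proc/net/tcp layout (truncated columns).
SAMPLE_TCP = """\
  sl  local_address rem_address   st tx_queue rx_queue
   0: 00000000:0050 00000000:0000 0A 00000000:00000000
   1: 0100007F:0050 0100007F:D431 01 00000000:00000000
   2: 0100007F:0050 0100007F:D432 06 00000000:00000000
   3: 0100007F:0050 0100007F:D433 06 00000000:00000000
"""

if __name__ == "__main__":
    print("usage of current partition: %.1f%%" % partition_usage_percent("."))
    print(count_tcp_states(SAMPLE_TCP))
```

On a live Linux server the second function would be given the contents of `/proc/net/tcp`. An unusually large count in one state, compared with a normal day, is the kind of signal this check looks for.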

Log on to the database server and check the database error log. Run tail -f to follow the output as it scrolls. It seems that the problem has been found. The output mainly consists of the following lines:

[ERROR] Got error 134 when reading table './uc_mumayi/cdb_uc_members'

[ERROR] Got error 134 when reading table './uc_mumayi_net/cdb_uc_members'

[ERROR] /usr/local/mysql/libexec/mysqld: The table 'pre_common_session' is full
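When the error log is long, it helps to reduce it to the set of affected tables. The sketch below groups the two kinds of message shown above: read errors (MySQL's `perror` utility describes code 134 as "Record was already deleted (or record file crashed)") and "table is full" errors. The function name and the patterns are assumptions modeled on these three lines only.

```python
import re

# Patterns modeled on the three error-log lines quoted in the article.
READ_ERROR_RE = re.compile(
    r"Got error (?P<code>\d+) when reading table '(?P<table>[^']+)'"
)
TABLE_FULL_RE = re.compile(r"The table '(?P<table>[^']+)' is full")


def summarize_errors(lines):
    """Return (read_errors, full_tables) found in MySQL error-log lines.

    read_errors is a sorted list of (error_code, table) pairs;
    full_tables is a sorted list of table names.
    """
    read_errors = set()
    full_tables = set()
    for line in lines:
        match = READ_ERROR_RE.search(line)
        if match:
            read_errors.add((int(match.group("code")), match.group("table")))
            continue
        match = TABLE_FULL_RE.search(line)
        if match:
            full_tables.add(match.group("table"))
    return sorted(read_errors), sorted(full_tables)


# The log lines from the article, used as sample input.
SAMPLE_ERRORS = [
    "[ERROR] Got error 134 when reading table './uc_mumayi/cdb_uc_members'",
    "[ERROR] Got error 134 when reading table './uc_mumayi_net/cdb_uc_members'",
    "[ERROR] /usr/local/mysql/libexec/mysqld: The table 'pre_common_session' is full",
]

if __name__ == "__main__":
    read_errors, full_tables = summarize_errors(SAMPLE_ERRORS)
    print("possibly damaged:", read_errors)
    print("full:", full_tables)
```

Separating the two groups mirrors the order of the fix that follows: the full table is handled first, then the possibly damaged tables are checked.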

Next, deal with the full table first by raising its maximum row count. I set the value to 10 million with the command:

mysql> ALTER TABLE pre_common_session MAX_ROWS = 10000000;

The load on the three web servers dropped immediately. The error log also indicates that two tables may be damaged. Check them, and repair any that turn out to be corrupt.

1) Check the first table:

mysql> check table cdb_uc_notelist;

The output is

+---------------------------+-------+----------+-----------------------------------------------------------+
| Table                     | Op    | Msg_type | Msg_text                                                  |
+---------------------------+-------+----------+-----------------------------------------------------------+
| uc_mumayi.cdb_uc_notelist | check | warning  | 11 clients are using or haven't closed the table properly |
| uc_mumayi.cdb_uc_notelist | check | warning  | Size of datafile is: 260372       Should be: 259760       |
| uc_mumayi.cdb_uc_notelist | check | error    | Wrong bytesec: 101-114-110 at linkstart: 258412           |
| uc_mumayi.cdb_uc_notelist | check | error    | Corrupt                                                   |
+---------------------------+-------+----------+-----------------------------------------------------------+
4 rows in set (0.04 sec)

If the table is reported as corrupt, repair it:

mysql> repair table cdb_uc_notelist;

The output is

+---------------------------+--------+----------+-----------------------------------------------+
| Table                     | Op     | Msg_type | Msg_text                                      |
+---------------------------+--------+----------+-----------------------------------------------+
| uc_mumayi.cdb_uc_notelist | repair | info     | Wrong bytesec: 101-114-110 at 258412; Skipped |
| uc_mumayi.cdb_uc_notelist | repair | warning  | Number of rows changed from 5715 to 5742      |
| uc_mumayi.cdb_uc_notelist | repair | status   | OK                                            |
+---------------------------+--------+----------+-----------------------------------------------+
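The check-then-repair decision above can also be made by a script when many tables need checking. The sketch below parses the text table printed by the mysql client and flags a table for repair when any row has Msg_type "error". Both function names are hypothetical, and the rule "any error row means repair" is a simplifying assumption; the samples are shortened copies of the outputs shown above.

```python
def parse_result_table(text):
    """Parse mysql client table output into a list of row dictionaries."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines and the "N rows in set" footer
        cells = [cell.strip() for cell in line.split("|")[1:-1]]
        if header is None:
            header = cells  # the first "|" row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def needs_repair(rows):
    """Assumed rule: repair when any row is reported with Msg_type 'error'."""
    return any(row.get("Msg_type") == "error" for row in rows)


# Shortened samples based on the CHECK TABLE and REPAIR TABLE outputs.
CHECK_OUTPUT = """\
+-------+----+----------+----------+
| Table | Op | Msg_type | Msg_text |
+-------+----+----------+----------+
| uc_mumayi.cdb_uc_notelist | check | warning | Size of datafile is: 260372 Should be: 259760 |
| uc_mumayi.cdb_uc_notelist | check | error | Corrupt |
+-------+----+----------+----------+
"""

REPAIR_OUTPUT = """\
| Table | Op | Msg_type | Msg_text |
| uc_mumayi.cdb_uc_notelist | repair | warning | Number of rows changed from 5715 to 5742 |
| uc_mumayi.cdb_uc_notelist | repair | status | OK |
"""

if __name__ == "__main__":
    print("repair after check? ", needs_repair(parse_result_table(CHECK_OUTPUT)))
    print("repair after repair?", needs_repair(parse_result_table(REPAIR_OUTPUT)))
```

The parser splits on the `|` character, so it only works while no cell contains that character itself, which holds for the messages shown here.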

2) Repair the second table. The method is the same as above.

3) Check the status again.

4) Ask the administrator to log in to the forum's admin back end and check whether everything works normally.

Original article: http:// B .formyz.org/2011/1124/53.html

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
