Troubleshooting guide for medium and high loads on CentOS servers

Source: Internet
Author: User
Tags apache log cpanel server memory

Technical support analysts often receive complaints about high server loads. In fact, cPanel software and the applications installed with it rarely cause high server load. The server owner, system administrator, or server provider should carry out a preliminary investigation of the high load and ask the analysts for help only after confirming that the situation is complex.

What causes high server load?

Excessive use of any of the following resources directly causes high load:

  • CPU
  • Memory (including virtual memory)
  • Disk I/O

How do I check these items?

That depends on whether you want to review current or historical resource usage. This article covers both.

A brief description of sar

You can use the sar tool to view historical resource usage. The tool is provided by the sysstat package, which should be installed by default on all cPanel servers. As long as cron runs the sysstat job periodically (/etc/cron.d/sysstat), the server's performance data is collected. If cron is not running, sysstat cannot collect historical statistics.

To view historical resource usage, we must give sar the path to the data file for the day in question.

For example, to view the server's load averages for the 23rd of this month, run the following command:

Code:

 
[user@host ~]$ sar -q -f /var/log/sa/sa23 

The -q option in the preceding command reports the load averages, while -f specifies the file from which sar reads its data. Note that sar may not retain data that is more than about a week old.
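
As a minimal sketch of how the file path is formed, the helper below maps a day of the month to the matching data file. The function name sa_file is hypothetical, and /var/log/sa is assumed to be the data directory, as in the example above.

```shell
# Hypothetical helper: print the sar data file for a given day of the month.
# Assumes the data files live in /var/log/sa, as in the example above.
sa_file() {
    day=${1#0}                      # drop a leading zero so 08 is not read as octal
    printf '/var/log/sa/sa%02d\n' "$day"
}

# Usage (requires sysstat data for that day):
#   sar -q -f "$(sa_file 23)"
```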

To view statistics for the current day, you do not need to specify a file. Run the following command to display today's load averages:

Code:

 
[user@host ~]$ sar -q 

We strongly recommend that you read the sar manual page:

Code:

 
[user@host ~]$ man sar 

The statistics this tool provides give a clear picture of how the server is running.

Current CPU usage

Run top and check the idle CPU percentage shown in the %id field of the Cpu(s) line. The higher the number, the better: it means the CPU is lightly loaded. A CPU that is 99% idle is doing almost no work, while a CPU that is 1% idle is running at nearly full load.

Code:

 
[user@host ~]$ top c 

Tip: Press P (Shift+p) while top is running to sort processes by CPU usage.
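
If you want the idle figure without opening the interactive display, the following sketch extracts it from one batch-mode run of top. The function name cpu_idle is hypothetical, and the parsing assumes the Cpu(s) line uses either the older "98.4%id" or the newer "98.4 id" layout.

```shell
# Hypothetical helper: read top output on stdin and print the idle percentage.
# Handles both the older "98.4%id" and the newer "98.4 id" layouts.
cpu_idle() {
    awk -F, '/Cpu\(s\)/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /id *$/) { gsub(/[^0-9.]/, "", $i); print $i; exit }
    }'
}

# Usage: one batch-mode iteration of top, then extract the idle figure.
#   top -bn1 | cpu_idle
```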

CPU usage history

View the "%idle" column:

Code:

 

[user@host ~]$ sar -p 
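
To pick out the busy intervals from a long report, a small filter can help. The sketch below assumes that %idle is the last column of each data row, as in the report above; the function name low_idle and the default limit of 20 are hypothetical.

```shell
# Hypothetical filter: print sar CPU rows whose %idle (last column) is below a limit.
low_idle() {
    awk -v limit="${1:-20}" '
        $1 == "Average:" { next }                   # skip the summary row
        $NF ~ /^[0-9.]+$/ && $NF + 0 < limit + 0    # numeric %idle under the limit
    '
}

# Usage: show the intervals in which the CPU was less than 20% idle.
#   sar -p | low_idle 20
```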


Current memory usage

Code:

 
[user@host ~]$ free -m 

Tip: Run top c and press M (Shift+m) to sort processes by memory usage.

Memory usage history

The command varies with the sar version. In earlier versions, the "-r" option displays both the memory usage percentage and the virtual memory (swap) usage percentage. In newer versions, "-r" displays memory usage, and the "-S" option displays the virtual memory usage percentage.

Check %memused and %swpused:

Code:

 
[user@host ~]$ sar -r 

Or, in newer versions:

Code:

 
[user@host ~]$ sar -r 

Code:

 
[user@host ~]$ sar -S 

A note on memory usage: it is normal for a server to show high memory usage. Memory is far faster to read and write than the server's disks, so the operating system tends to use spare memory as a cache and preload data to speed up reads.

For the same reason, a high memory usage percentage is not a serious problem in itself (unless you have not set up a swap partition, but that is a separate matter). What you really need to watch is the percentage of virtual memory (swap) in use, because virtual memory takes over only when the server's physical memory is fully occupied. The lower this value, the healthier the server. A virtual memory usage of 0% means the server can do all of its work in physical memory.

How much virtual memory usage is too much? That is a judgment call. In general, if virtual memory usage stays low, the server is still in good shape. If it grows over time (for example, from 1% to 7% and then to 32%), some process on the server is consuming memory, and you should investigate promptly (rather than simply installing more memory). Once the server has exhausted all physical and virtual memory, the whole system becomes extremely slow and must be restarted to restore normal operation temporarily.
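
To track the virtual memory (swap) percentage over time, it can help to reduce it to a single number. The sketch below computes it from the SwapTotal and SwapFree fields of /proc/meminfo; the function name swap_used_pct is hypothetical.

```shell
# Hypothetical helper: read /proc/meminfo-style text on stdin and print the
# percentage of swap in use as a whole number (0 if no swap is configured).
swap_used_pct() {
    awk '
        /^SwapTotal:/ { total = $2 }
        /^SwapFree:/  { free = $2 }
        END { if (total > 0) printf "%d\n", (total - free) * 100 / total; else print 0 }
    '
}

# Usage:
#   swap_used_pct < /proc/meminfo
```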

Current disk I/O usage

Note: This does not work for OpenVZ/Virtuozzo containers.

The following command prints disk usage statistics once per second, ten times in a row. Pay attention to the %util column in the output:

Code:

 

[user@host ~]$ iostat -x 1 10 
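
When a server has many devices, a filter can list only the busy ones. The sketch below locates the %util column from the "Device" header line rather than assuming its position; the function name busy_disks and the default limit of 80 are hypothetical.

```shell
# Hypothetical filter: read "iostat -x" output on stdin and print the name and
# %util of every device whose %util is above the given limit.
busy_disks() {
    awk -v limit="${1:-80}" '
        /^Device/ { for (i = 1; i <= NF; i++) if ($i == "%util") col = i; next }
        col && NF >= col && $col + 0 > limit + 0 { print $1, $col }
    '
}

# Usage:
#   iostat -x 1 10 | busy_disks 80
```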


Historical disk I/O usage

Code:

 
[user@host ~]$ sar -d 

A good system administrator knows the server's baseline load and can tell immediately when the current load exceeds it. The main purpose (apart from preventing the server from grinding to a halt and having to be restarted) is to learn promptly what the server is running when the load is high. A quick response lets you troubleshoot a problem as soon as it is detected.

If the load spikes between two and four in the morning, we cannot investigate right away because we are asleep. sar keeps recording on the server and helps us find out which resources were heavily used during that period, but it cannot reveal the actual cause. High load has many possible causes, including DoS attacks, spam attacks, poorly written PHP scripts, overly aggressive web crawlers, hardware faults, and a surge in disk writes from a user's MySQL database.

The good news is that tools can collect this information and send you the results automatically when the load is too high. How? Start with the process list:

Code:

 
[user@host ~]$ ps auxwwwf 

I created a shell script based on a set of Perl scripts from servers I have managed. It works alongside other server monitoring tools (such as Nagios) and makes my job easier. It checks six different items (detailed below) and emails me the process list when any of them exceeds its threshold.

Note: cPanel is not responsible for the development, maintenance, or technical support of this script, so do not open a support request about it. If you have questions while using it, post on a relevant forum or consult an experienced system administrator.

The resources it checks are as follows:

  • One-minute load average
  • Virtual memory usage (unit: KB)
  • Memory usage (unit: KB)
  • Number of packets received per second
  • Number of packets sent per second
  • Process count

How to use scripts

To run this script automatically, set up a cron job and choose a frequency that suits your situation. I have found that running it every five minutes works well. The script does not need to run as root, so it needs no elevated privileges.

If any of the monitored resources exceeds its custom threshold, the script automatically sends an email containing the current process list.
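
The threshold-and-notify idea can be sketched as follows for the load average alone. This is an illustration only and is not the script described in this article: the function names, the MAX_LOAD value, and the crontab path are assumptions, and it presumes a mail command is available on the server.

```shell
# Hypothetical sketch of the threshold-and-notify idea; NOT the script
# described in this article. MAX_LOAD and EMAIL are placeholder values.
MAX_LOAD=10
EMAIL=you@example.com

# Succeed (exit status 0) when the first number is greater than the second.
over_threshold() {
    awk -v value="$1" -v max="$2" 'BEGIN { exit !(value + 0 > max + 0) }'
}

# Compare the one-minute load average with MAX_LOAD and mail the process list.
check_load() {
    load=$(cut -d' ' -f1 /proc/loadavg)
    if over_threshold "$load" "$MAX_LOAD"; then
        ps auxwwwf | mail -s "$(hostname) [L: $load]" "$EMAIL"
    fi
}

# Example crontab entry (the path is an assumption) for a script that calls
# check_load every five minutes:
#   */5 * * * * /home/user/bin/loadcheck.sh
```

The comparison is done in awk because load averages are decimal numbers, which the shell's own integer tests cannot compare.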

The subject line of the email is as follows:

Code:

 
server.example.com [L: 35] [P: 237] [Swap Use: 1% ] [pps in: 54  pps out: 289] 

Each entry is explained below:

  • L is the one-minute load average.
  • P is the number of processes in the current process list.
  • Swap Use is the percentage of virtual memory in use.
  • pps in is the number of packets received per second.
  • pps out is the number of packets sent per second.

Precautions before using the script

Important: Adjust the values in the script to suit your own server. No perfect default exists, because different server environments call for different limits in practice. For example, a server with sixteen CPU cores can sustain a much higher one-minute load average than a server with a single core.

Note: You need to add your email address to the EMAIL variable as follows:

Code:

 
EMAIL=you@example.com 

The following five items also need to be adjusted according to the actual situation:

  • MAX_LOAD
  • MAX_SWAP_USED
  • MAX_MEM_USED
  • MAX_PPS_OUT
  • MAX_PPS_IN

Code:

 
#!/bin/sh
export PATH=/bin:/usr/bin
############################################################################
# Copyright Jeff Petersen, 2009-2013
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see

Note that the process list output contains several useful columns covering each process's CPU and memory usage:

  • %CPU
  • %MEM
  • VSZ
  • RSS
  • TIME (the cumulative CPU time the process has used)

We can analyze the causes of high server load in several ways. A few common approaches are listed below; the list is for reference only and is not comprehensive:

  • Use mysqladmin processlist (or 'mysqladmin pr') to check the MySQL process list
  • Use mytop to check the MySQL process list
  • Review the log files. It pays to know what your server is telling you. Is your server under a brute-force attack?
  • Run dmesg to check possible hardware faults
  • Use netstat to view server connections
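
As one example of the netstat approach, the sketch below counts connections per remote address, which can point to a single host that is flooding the server. The function name top_remote_ips is hypothetical, and the parsing assumes the remote address is the fifth column of netstat -ntu output.

```shell
# Hypothetical helper: read "netstat -ntu" output on stdin and print
# "count address" pairs, busiest remote address first.
top_remote_ips() {
    awk '$1 ~ /^(tcp|udp)/ {
             addr = $5
             sub(/:[0-9*]+$/, "", addr)     # strip the trailing :port
             count[addr]++
         }
         END { for (a in count) print count[a], a }' | sort -rn
}

# Usage:
#   netstat -ntu | top_remote_ips | head
```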

Below are the log files worth attention and their storage paths:

  • System log: /var/log/messages, /var/log/secure
  • SMTP log: /var/log/exim_mainlog, /var/log/exim_rejectlog, /var/log/exim_paniclog
  • POP3/IMAP log: /var/log/maillog
  • Apache log: /usr/local/apache/logs/access_log, /usr/local/apache/logs/error_log, /usr/local/apache/logs/suexec_log, /usr/local/apache/logs/suphp_log
  • Website log: /usr/local/apache/domlogs/ (use this to find sites with traffic in the last 60 seconds: find -maxdepth 1 -type f -mmin -1 | egrep -v 'offset|_log$')
  • Cron log: /var/log/cron
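
To answer the brute-force question from the logs, one option is to count failed SSH logins per source address. The sketch below assumes the usual sshd "Failed password for ... from ADDRESS port ..." message format; the function name failed_logins is hypothetical.

```shell
# Hypothetical helper: read sshd log lines on stdin (for example from
# /var/log/secure) and count "Failed password" entries per source address.
failed_logins() {
    awk '/Failed password/ {
             for (i = 1; i < NF; i++)
                 if ($i == "from") { count[$(i + 1)]++; break }
         }
         END { for (ip in count) print count[ip], ip }' | sort -rn
}

# Usage (reading /var/log/secure normally requires root):
#   failed_logins < /var/log/secure | head
```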

You are welcome to post questions, comments on this article, and anything else you wish to share in the comment section. No stand-alone guide can cover everything, so this article inevitably has omissions. We look forward to your feedback and hope you find the article useful.

