Later, I tracked the problem down and found that it is closely related to PHP's file_get_contents() function.
On medium and large websites, API calls over HTTP are commonplace. PHP programmers like to use the simple and convenient file_get_contents("http://example.com/") to fetch the content of a URL, but if the site at http://example.com/ responds slowly, file_get_contents() stays blocked there waiting for the response instead of timing out promptly.
We know that php.ini has a max_execution_time parameter that sets the maximum execution time of a PHP script, but under php-cgi (PHP-FPM) this parameter does not take effect. What actually controls the maximum execution time of a PHP script is the following parameter in the php-fpm.conf configuration file:
The timeout (in seconds) for serving a single request after which the worker process will be terminated
Should be used when 'max_execution_time' ini option does not stop script execution for some reason
'0s' means 'off'
<value name="request_terminate_timeout">0s</value>
The default value is 0 seconds, which means a PHP script is allowed to run indefinitely. As a result, once every php-cgi process is stuck in file_get_contents(), this Nginx+PHP web server can no longer handle new PHP requests, and Nginx returns "502 Bad Gateway" to users. Changing this parameter to cap the execution time of a PHP script is necessary, but it treats the symptom rather than the root cause. For example, with the value set to 30s, if the pages fetched by file_get_contents() are slow, 150 php-cgi processes can handle only 5 requests per second, and the web server will still have a hard time avoiding "502 Bad Gateway".
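The figure of 5 requests per second follows from dividing the number of worker processes by the time each request holds a worker. A minimal sketch of that arithmetic is shown here; the function name max_requests_per_second is a hypothetical helper introduced only for illustration, and it assumes every request blocks for the full timeout.

```php
<?php
/**
 * Upper bound on requests per second when every request holds one worker
 * for $secondsPerRequest seconds. Hypothetical helper, for illustration only.
 */
function max_requests_per_second($workers, $secondsPerRequest)
{
    return $workers / $secondsPerRequest;
}

// Figures from the text: 150 php-cgi processes, each blocked for up to 30 seconds.
echo max_requests_per_second(150, 30), "\n"; // prints 5
```

Lowering the time each request can block, rather than only adding workers, is what raises this ceiling.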
For a thorough fix, PHP programmers have to drop the habit of calling file_get_contents("http://example.com/") directly. Instead, modify the call slightly to add a timeout, and implement HTTP GET requests as follows. If that feels like too much trouble, you can wrap the following code in a function.
<?php
$ctx = stream_context_create(array(
    'http' => array(
        'timeout' => 1 // set a timeout, in seconds
    )
));
// Pass false for use_include_path and attach the context as the third argument
file_get_contents("http://example.com/", false, $ctx);
?>
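One way to wrap the snippet in a function, as suggested above, might look like the following. This is only a sketch: the name http_get_with_timeout, its default timeout of 1 second, and the choice to suppress warnings with @ are assumptions made for illustration, not part of the original code.

```php
<?php
/**
 * Hypothetical helper: fetch a URL with HTTP GET and give up after $timeout seconds.
 * Returns the response body as a string, or false on failure or timeout.
 */
function http_get_with_timeout($url, $timeout = 1)
{
    $ctx = stream_context_create(array(
        'http' => array(
            'method'  => 'GET',
            'timeout' => $timeout, // timeout in seconds
        ),
    ));

    // @ suppresses the PHP warning so the caller can check the return value instead.
    return @file_get_contents($url, false, $ctx);
}
```

A caller would then write something like $body = http_get_with_timeout("http://example.com/", 1); and compare the result against false with === before using it, since an empty response body is also falsy.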
Of course, this is not the only thing that can drive php-cgi processes to 100% CPU, so how do you determine whether file_get_contents() is the cause?
First, use the top command to find the php-cgi processes with high CPU usage.
top - 10:34:18 up 724 days, 21:01, 3 users, load average: 17.86, 11.16, 7.69
Tasks: 561 total, running, 546 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.9%us, 4.2%sy, 0.0%ni, 89.4%id, 0.2%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 8100996k total, 4320108k used, 3780888k free, 772572k buffers
Swap: 8193108k total, 50776k used, 8142332k free, 412088k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10747 www 0 360m 22m 12m R 100.6 0.3 0:02.60 php-cgi
10709 www 0 359m 28m 17m R 96.8 0.4 0:11.34 php-cgi
10745 www 0 360m 24m 14m R 94.8 0.3 0:39.51 php-cgi
10707 www 0 360m 25m 14m S 77.4 0.3 0:33.48 php-cgi
10782 www 0 360m 26m 15m R 75.5 0.3 0:10.93 php-cgi
10708 www 0 360m 22m 12m R 69.7 0.3 0:45.16 php-cgi
10683 www 0 362m 28m 15m R 54.2 0.4 0:32.65 php-cgi
10711 www 0 360m 25m 15m R 52.2 0.3 0:44.25 php-cgi
10688 www 0 359m 25m 15m R 38.7 0.3 0:10.44 php-cgi
10719 www 0 360m 26m 16m R 7.7 0.3 0:40.59 php-cgi
Pick the PID of one of the php-cgi processes at 100% CPU and trace it with the following command:
strace -p 10747
select(7, [6], [6], [], {, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
If the trace shows this repeating select/poll pattern, you can conclude that the problem is caused by file_get_contents().