On a Linux server running Nginx with PHP-CGI (php-fpm), the system load sometimes rises suddenly, and the top command shows many php-cgi processes with CPU usage close to 100%. By tracing these processes, I later found that such cases are closely related to PHP's file_get_contents() function.
HTTP-based API calls are common on large and medium-sized websites. PHP programmers like to use the simple and convenient file_get_contents("http://example.com/") to fetch the content returned by a URL. However, if the site at http://example.com/ responds slowly, file_get_contents() blocks waiting for it and does not time out promptly (without an explicit timeout it falls back to the default_socket_timeout ini setting, which is 60 seconds by default).
We know that max_execution_time in php.ini can be used to set the maximum execution time of PHP scripts. However, this parameter does not take effect in php-cgi (php-fpm). The parameter that truly controls the maximum execution time of PHP scripts is the following one in the php-fpm.conf configuration file:
The timeout (in seconds) for serving a single request after which the worker process will be terminated
Should be used when 'max_execution_time' ini option does not stop script execution for some reason
'0s' means 'off'
<value name="request_terminate_timeout">0s</value>
The default value is 0 seconds, which means a PHP script is allowed to run indefinitely. As a result, when all php-cgi processes are stuck in file_get_contents(), the Nginx + PHP web server can no longer handle new PHP requests, and Nginx returns "502 Bad Gateway" to users. Changing this parameter to set a maximum execution time for PHP scripts helps, but it only treats the symptom. For example, if it is changed to 30s and file_get_contents() is slow to fetch page content, each php-cgi process can be tied up for as long as 30 seconds, so 150 php-cgi processes can handle only 5 requests per second, and the web server still can hardly avoid "502 Bad Gateway".
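The figure of 5 requests per second comes from simple division. Here is a minimal sketch of that estimate, assuming every worker is blocked for the full timeout on each request; the function name max_requests_per_second is only illustrative and is not part of PHP or php-fpm.

```php
<?php
// Rough upper bound on throughput when each worker is blocked for
// $secondsPerRequest before it can pick up the next request.
function max_requests_per_second(int $workers, float $secondsPerRequest): float
{
    return $workers / $secondsPerRequest;
}

$workers = 150; // number of php-cgi processes (figure from the text)
$timeout = 30;  // request_terminate_timeout in seconds (figure from the text)

echo max_requests_per_second($workers, $timeout), " requests per second\n";
```

With a 1-second timeout instead, the same 150 processes could in the worst case still serve about 150 requests per second, which is why a short timeout on the HTTP call itself matters more than the process-level limit.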
The only thorough solution is for PHP programmers to drop the habit of calling file_get_contents("http://example.com/") directly and, with a slight modification, add a timeout, issuing HTTP GET requests in the following way. If that seems like too much trouble, you can wrap the following code in a function.
The code is as follows:
<?php
$ctx = stream_context_create(array(
    'http' => array(
        'timeout' => 1 // set a timeout, in seconds
    )
));
file_get_contents("http://example.com/", false, $ctx);
?>
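As suggested above, the timeout logic can be wrapped in a function so that it is applied everywhere. The following is only a sketch of one way to do it: the names build_timeout_context and http_get_with_timeout are made up for this example, and the 1-second default simply mirrors the value used above.

```php
<?php
// Build a stream context that carries an HTTP timeout (in seconds).
function build_timeout_context($timeoutSeconds = 1)
{
    return stream_context_create(array(
        'http' => array(
            'method'  => 'GET',
            'timeout' => $timeoutSeconds,
        ),
    ));
}

// Fetch a URL with a timeout. Returns the response body as a string,
// or false if the request fails (for example, when the server does
// not answer in time).
function http_get_with_timeout($url, $timeoutSeconds = 1)
{
    // @ suppresses the warning that file_get_contents() raises on failure;
    // callers are expected to check the return value for false instead.
    return @file_get_contents($url, false, build_timeout_context($timeoutSeconds));
}
```

A call such as http_get_with_timeout("http://example.com/", 2) then replaces the bare file_get_contents("http://example.com/"), and the caller should handle a false return value instead of assuming content always comes back.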
Of course, this is not the only cause of php-cgi processes using 100% CPU. So how can we determine that file_get_contents() is responsible?
First, run the top command to find the php-cgi processes with high CPU usage.
The code is as follows:
top - 10:34:18 up 724 days, 3 users, load average: 17.86, 11.16, 7.69
Tasks: 561 total, 15 running, 546 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.9%us, 4.2%sy, 0.0%ni, 89.4%id, 0.2%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 8100996k total, 4320108k used, 3780888k free, 772572k buffers
Swap: 8193108k total, 50776k used, 8142332k free, 412088k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10747 www 18 0 360m 22m 12m R 100.6 0:02.60 php-cgi
10709 www 16 0 359m 28m 17m R 96.8 0.4. 34 php-cgi
10745 www 18 0 360m 24m 14m R 94.8 0.3. 51 php-cgi
10707 www 18 0 360m 25m 14m S 77.4 0:33.48 php-cgi
10782 www 20 0 360m 26m 15m R 75.5 0.3. 93 php-cgi
10708 www 25 0 360m 22m 12m R 69.7 0.3. 16 php-cgi
10683 www 25 0 362m 28m 15m R 54.2 0.4. 65 php-cgi
10711 www 25 0 360m 25m 15m R 52.2 0.3. 25 php-cgi
10688 www 25 0 359m 25m 15m R 38.7 0.3. 44 php-cgi
10719 www 25 0 360m 26m 16m R 7.7 0.3. 59 php-cgi
Pick the PID of one of the php-cgi processes running at 100% CPU and trace it with the following command:
The code is as follows:
strace -p 10747
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
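If the trace shows this endlessly repeating select()/poll() pair on the same file descriptor, the problem is caused by file_get_contents(). Each select() returns at once (the full 15 seconds are still left) and each poll() with a zero timeout finds nothing to read, so the process spins in a busy loop, which is why its CPU usage reaches 100%.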