On a Linux server running Nginx with PHP-CGI (php-fpm), the system load sometimes rises suddenly, and the top command shows many php-cgi processes with CPU usage close to 100%. After tracing the problem, I found that such cases are closely related to PHP's file_get_contents() function.
HTTP-based API calls are common on large and medium-sized websites. PHP programmers like to use the simple and convenient file_get_contents("http://example.com/") to fetch the content returned by a URL. But if the site at http://example.com/ responds slowly, file_get_contents() simply sits there waiting: with no explicit timeout, it is bounded only by the default_socket_timeout ini setting (60 seconds by default).
We know that max_execution_time in php.ini can be used to set the maximum execution time of PHP scripts. However, this parameter does not take effect in php-cgi (php-fpm). The parameter that really controls the maximum execution time of PHP scripts is request_terminate_timeout in the php-fpm.conf configuration file:
- The timeout (in seconds) for serving a single request after which the worker process will be terminated
- Should be used when 'max_execution_time' ini option does not stop script execution for some reason
- '0s' means 'off'
- <value name="request_terminate_timeout">0s</value>
The default value is 0 seconds, which means a PHP script is allowed to run indefinitely. As a result, when all php-cgi processes are stuck in file_get_contents(), the Nginx + PHP web server can no longer process new PHP requests, and Nginx returns "502 Bad Gateway" to users. Changing this parameter to set a maximum execution time for PHP scripts is necessary, but it only treats the symptom. For example, if it is set to 30s and file_get_contents() is slow to fetch the page content, then 150 php-cgi processes can handle only 5 requests per second, and the web server will still struggle to avoid "502 Bad Gateway".
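The figure of 5 requests per second follows from dividing the number of workers by the time each one is held. A minimal sketch of that arithmetic, where the function name max_requests_per_second is a hypothetical helper and the model assumes every worker is blocked for the full timeout:

```php
<?php
// Worst-case throughput when every worker is held for the full timeout.
// Assumption: each request blocks one php-cgi worker for $secondsPerRequest.
function max_requests_per_second(int $workers, float $secondsPerRequest): float
{
    return $workers / $secondsPerRequest;
}

// 150 php-cgi processes, each stuck for up to 30 seconds.
echo max_requests_per_second(150, 30), "\n";
```

With a 1 second timeout instead, the same 150 processes could in the worst case still serve about 150 requests per second.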
For a thorough solution, PHP programmers have to drop the habit of calling file_get_contents("http://example.com/") directly. Instead, make a small change: add a timeout and use the following method for HTTP GET requests. If that feels cumbersome, you can wrap the code below in a function.
- <?php
- $ctx = stream_context_create(array(
-     'http' => array(
-         'timeout' => 1 // set a timeout, in seconds
-     )
- ));
- file_get_contents("http://example.com/", false, $ctx);
- ?>
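One way to wrap this in a function might look like the sketch below. The name http_get, its default timeout, and the choice to return null on failure are assumptions made for illustration, not part of the original code. Note that the 'timeout' context option is documented as a read timeout, so it bounds each wait on the socket rather than the total duration of the request.

```php
<?php
/**
 * Hypothetical helper: fetch a URL with a read timeout.
 * Returns the response body, or null on failure or timeout.
 */
function http_get(string $url, float $timeoutSeconds = 1.0): ?string
{
    $ctx = stream_context_create([
        'http' => [
            'method'  => 'GET',
            'timeout' => $timeoutSeconds, // read timeout, in seconds
        ],
    ]);

    // @ suppresses the warning raised on failure; the false return
    // value is checked explicitly below instead.
    $body = @file_get_contents($url, false, $ctx);

    return $body === false ? null : $body;
}

// Example call (commented out so the sketch makes no network request):
// $html = http_get('http://example.com/', 1.0);
// if ($html === null) { /* use a cached copy or a default value */ }
```

Returning null lets the caller fall back to cached or default content quickly, so a slow remote site does not tie up a php-cgi worker.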
Of course, this is not the only reason a php-cgi process can hit 100% CPU. So how can we determine that it is caused by the file_get_contents() function?
First, run the top command to view the php-cgi process with high CPU usage.
- top - 10:34:18 up 724 days, 21:01, 3 users, load average: 17.86, 11.16, 7.69
- Tasks: 561 total, 15 running, 546 sleeping, 0 stopped, 0 zombie
- Cpu(s): 5.9%us, 4.2%sy, 0.0%ni, 89.4%id, 0.2%wa, 0.0%hi, 0.2%si, 0.0%st
- Mem: 8100996k total, 4320108k used, 3780888k free, 772572k buffers
- Swap: 8193108k total, 50776k used, 8142332k free, 412088k cached
- PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
- 10747 www 18 0 360m 22m 12m R 100.6 0.3 0:02.60 php-cgi
- 10709 www 16 0 359m 28m 17m R 96.8 0.4 0:11.34 php-cgi
- 10745 www 18 0 360m 24m 14m R 94.8 0.3 0:39.51 php-cgi
- 10707 www 18 0 360m 25m 14m S 77.4 0.3 0:33.48 php-cgi
- 10782 www 20 0 360m 26m 15m R 75.5 0.3 0:10.93 php-cgi
- 10708 www 25 0 360m 22m 12m R 69.7 0.3 0:45.16 php-cgi
- 10683 www 25 0 362m 28m 15m R 54.2 0.4 0:32.65 php-cgi
- 10711 www 25 0 360m 25m 15m R 52.2 0.3 0:44.25 php-cgi
- 10688 www 25 0 359m 25m 15m R 38.7 0.3 0:10.44 php-cgi
- 10719 www 25 0 360m 26m 16m R 7.7 0.3 0:40.59 php-cgi
Pick the PID of one of the php-cgi processes at 100% CPU and trace it with the following command:
- strace -p 10747
If the screen displays:
- select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
- poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
- select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
- poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
- select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
- poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
- select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
- poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
- select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
- poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
- select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
- poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
- select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
- poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
- select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
- poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
- select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
- poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
- select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
- poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
then the problem is caused by file_get_contents(): the process is busy-looping on select() and poll() calls against the same socket.
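If the trace is saved to a file (for example with strace's -o option), the check for this pattern can be roughly automated. The sketch below is only a heuristic: the function name looks_like_socket_spin and the 0.9 threshold are hypothetical, and a high share of select()/poll() lines is a hint rather than proof that file_get_contents() is responsible.

```php
<?php
/**
 * Hypothetical heuristic: returns true when at least $threshold of the
 * non-empty strace lines are select() or poll() calls, which matches
 * the busy-loop pattern shown above.
 */
function looks_like_socket_spin(array $lines, float $threshold = 0.9): bool
{
    // Trim whitespace and drop empty lines.
    $lines = array_values(array_filter(array_map('trim', $lines), 'strlen'));
    if (count($lines) === 0) {
        return false;
    }

    $hits = 0;
    foreach ($lines as $line) {
        if (preg_match('/^(select|poll)\(/', $line) === 1) {
            $hits++;
        }
    }

    return $hits / count($lines) >= $threshold;
}

// Example call on a saved trace (path is an assumption):
// var_dump(looks_like_socket_spin(file('/tmp/php-cgi.trace')));
```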