Sometimes, on a Linux server running Nginx with PHP-CGI (php-fpm), the system load suddenly spikes, and the top command shows many php-cgi processes each using close to 100% CPU. After some tracing, I found that these incidents are closely related to PHP's file_get_contents() function.
Medium and large websites commonly call HTTP-based APIs. PHP programmers like the simple, convenient file_get_contents("http://example.com/") call for fetching the content returned by a URL. But if the site at http://example.com/ responds slowly, file_get_contents() just waits there, because the call sets no timeout of its own.
We know that max_execution_time in php.ini sets the maximum execution time of a PHP script. However, this parameter does not take effect under php-cgi (php-fpm). What really controls the maximum execution time of a PHP script there is the request_terminate_timeout setting in the php-fpm.conf configuration file, described as follows:
    The timeout (in seconds) for serving a single request after which the worker process will be terminated
    Should be used when 'max_execution_time' ini option does not stop script execution for some reason
    '0s' means 'off'
    0s
The default value is 0 seconds, which means a PHP script may keep running indefinitely. So when every php-cgi process is stuck in file_get_contents(), the Nginx + PHP web server can no longer handle new PHP requests, and Nginx returns "502 Bad Gateway" to users. Change this parameter to cap the execution time of a PHP script, for example at 30s. Even then, if file_get_contents() is slow to fetch its pages, 150 php-cgi processes can handle only 5 requests per second (150 / 30), and the web server will still struggle to avoid "502 Bad Gateway".
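The arithmetic behind that figure is easy to check in a few lines. This is only a back-of-the-envelope sketch: worst_case_requests_per_second is a name made up for illustration, and it assumes every worker is blocked for the full timeout on every request.

```php
<?php
// Worst-case throughput when each worker is tied up for the whole timeout.
// Hypothetical helper, not part of PHP or php-fpm.
function worst_case_requests_per_second(int $workers, float $secondsPerRequest): float
{
    if ($secondsPerRequest <= 0) {
        throw new InvalidArgumentException('secondsPerRequest must be positive');
    }
    // Each worker finishes one request per $secondsPerRequest seconds.
    return $workers / $secondsPerRequest;
}

// 150 php-cgi processes, each blocked for 30 seconds per request.
echo worst_case_requests_per_second(150, 30), " requests per second\n";
```

Lowering the timeout raises this ceiling in direct proportion, which is why a short per-request timeout in the code matters more than the 30s safety net.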
For a thorough fix, PHP programmers have to drop the habit of calling file_get_contents("http://example.com/") directly and make a small change instead: add a timeout and issue HTTP GET requests as shown here. If that feels like a hassle, you can wrap the following code in a function.
    <?php
    $ctx = stream_context_create(array(
        'http' => array(
            'timeout' => 1 // set a timeout, in seconds
        )
    ));
    // The second argument is use_include_path, so pass false rather than 0.
    file_get_contents("http://example.com/", false, $ctx);
    ?>
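Wrapping this in a function could look like the following. It is a minimal sketch: build_timeout_context and fetch_url_with_timeout are hypothetical names chosen for illustration, and the one-second default is just the value used above. Note that the http context's timeout option is a read timeout, so it bounds each wait for data rather than the whole transfer.

```php
<?php
// Hypothetical helpers; the names are not part of PHP or of the original post.

/** Build a stream context whose HTTP requests use a read timeout of $timeout seconds. */
function build_timeout_context(float $timeout)
{
    return stream_context_create([
        'http' => [
            'timeout' => $timeout, // read timeout, in seconds
        ],
    ]);
}

/** Fetch $url with HTTP GET; returns the body as a string, or false on failure or timeout. */
function fetch_url_with_timeout(string $url, float $timeout = 1.0)
{
    // @ suppresses the warning file_get_contents() raises on failure;
    // callers are expected to check the return value instead.
    return @file_get_contents($url, false, build_timeout_context($timeout));
}
```

A call then reads $body = fetch_url_with_timeout("http://example.com/", 1); and the caller should compare with $body === false, since an empty response body is a valid result.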
Of course, this is not the only reason a php-cgi process can hit 100% CPU. So how can we determine that file_get_contents() is the cause?
First, run the top command to find the php-cgi processes with high CPU usage.
top - 10:34:18 up 724 days,  3 users,  load average: 17.86, 11.16, 7.69
Tasks: 561 total,  15 running, 546 sleeping,   0 stopped,   0 zombie
Cpu(s):  5.9% us,  4.2% sy,  0.0% ni, 89.4% id,  0.2% wa,  0.0% hi,  0.2% si,  0.0% st
Mem:   8100996k total,  4320108k used,  3780888k free,   772572k buffers
Swap:  8193108k total,    50776k used,  8142332k free,   412088k cached
  PID USER  PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+ COMMAND
10747 www   18   0  360m  22m  12m R 100.6    …  0:02.60 php-cgi
10709 www   16   0  359m  28m  17m R  96.8  0.4        … php-cgi
10745 www   18   0  360m  24m  14m R  94.8  0.3        … php-cgi
10707 www   18   0  360m  25m  14m S  77.4    …  0:33.48 php-cgi
10782 www   20   0  360m  26m  15m R  75.5  0.3        … php-cgi
10708 www   25   0  360m  22m  12m R  69.7  0.3        … php-cgi
10683 www   25   0  362m  28m  15m R  54.2  0.4        … php-cgi
10711 www   25   0  360m  25m  15m R  52.2  0.3        … php-cgi
10688 www   25   0  359m  25m  15m R  38.7  0.3        … php-cgi
10719 www   25   0  360m  26m  16m R   7.7  0.3        … php-cgi
Pick the PID of one of the php-cgi processes running at 100% CPU and trace it with the following command:
strace -p 10747
The screen may then show output like this:
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
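One plausible reading of this trace: select() is asked to wait up to 15 seconds for fd 6 to become readable or writable, but it returns at once because the socket is writable (out [6], with the full 15 seconds left), and the zero-timeout poll() that follows finds nothing to read. The loop therefore never sleeps. The sketch below reproduces the same kind of spin with PHP's stream_select() on a local socket pair. It illustrates the pattern under that assumption and is not PHP's internal HTTP client code; count_spins is a made-up name, and the socket pair assumes a Unix-like system.

```php
<?php
// Sketch: a select() that watches an idle socket for writability returns
// immediately, so a loop built around it burns CPU instead of sleeping.
function count_spins(float $seconds): int
{
    // A connected local socket pair; nothing is ever written to it.
    $pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
    if ($pair === false) {
        throw new RuntimeException('could not create socket pair');
    }
    [$a, $b] = $pair;

    $spins = 0;
    $deadline = microtime(true) + $seconds;
    while (microtime(true) < $deadline) {
        $read = [$a];
        $write = [$a]; // an idle socket is writable right away
        $except = null;
        // Returns immediately despite the 15-second timeout.
        if (stream_select($read, $write, $except, 15) === false) {
            break;
        }
        // $read is now empty (no data arrived); $write still holds the socket.
        $spins++;
    }
    fclose($a);
    fclose($b);
    return $spins;
}

echo count_spins(0.05), " iterations in 50 ms\n";
```

Watching only the read set with a real timeout lets the process sleep until data arrives, which is what a bounded timeout on the request achieves in practice.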
A trace showing this alternating select/poll pattern basically confirms that the problem is caused by file_get_contents(). To locate the offending code, enable the slow execution log in php-fpm.conf with a threshold of 3s (the request_slowlog_timeout and slowlog settings): requests that run longer than 3 seconds are recorded, and the log prints the line numbers of the slow code.
php-cgi (php-fpm) uses Libevent, and on the Linux 2.6 kernel Libevent by default handles FastCGI network requests with the epoll I/O model rather than select/poll. The line numbers recorded in the slow log all point at file_get_contents() and similar functions. Such functions act as a client that initiates an HTTP request, and they use the select/poll model. In other words, only network functions like file_get_contents() that meet all three conditions (TCP requests do not time out by default, the select/poll model is used, and the process sits at 100% CPU) produce what strace -p shows here.
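Once the slow log is collecting entries, picking out the relevant frames can be scripted. The sketch below assumes each stack frame in the slow log has the form "[0x...] function() /path/file.php:line"; check that against your own log before relying on it. The function name find_slow_frames and the sample entry are invented for illustration and are not real log data.

```php
<?php
// Sketch: list slow-log stack frames that mention a given function.
// The assumed frame layout is "[0x...] function() /path/file.php:line".
function find_slow_frames(string $slowLog, string $function = 'file_get_contents'): array
{
    $hits = [];
    foreach (preg_split('/\R/', $slowLog) as $line) {
        // Capture the function name, then the file and line number after it.
        if (preg_match('/\]\s+(\w+)\(\)\s+(\S+):(\d+)\s*$/', $line, $m)
            && $m[1] === $function) {
            $hits[] = ['file' => $m[2], 'line' => (int) $m[3]];
        }
    }
    return $hits;
}

// Made-up sample entry, for illustration only.
$sample = <<<LOG
[01-Jan-2010 10:34:21] [pool default] pid 10747
script_filename = /var/www/api/index.php
[0x00007f0000000000] file_get_contents() /var/www/api/client.php:42
[0x00007f0000000100] fetch_profile() /var/www/api/index.php:17
LOG;

print_r(find_slow_frames($sample));
```

Each hit gives the file and line to revisit, where the bare file_get_contents() call can be replaced with a version that sets a timeout.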
This article is taken from Zhang Ke's blog