My server has been going down frequently lately. It crashed again just as I was leaving work, and the Nginx 502 Bad Gateway error reminded me of the 504 Gateway Time-out I had seen earlier; the two are clearly related. A 504 Gateway Time-out means that the request to the gateway did not get through. Put simply, Nginx could not reach a PHP-CGI process that was able to execute the request.
 
Solving these two problems requires looking at them together. In general, Nginx 502 Bad Gateway is related to the settings in php-fpm.conf, while Nginx 504 Gateway Time-out is related to the settings in nginx.conf.
 
The 504 Gateway Time-out was covered in a previous article, so I will set it aside here and go straight to the 502 Bad Gateway fix. The key is php-fpm.conf, which has two crucial parameters: "max_children" and "request_terminate_timeout". Both need to be worked out for your server.
 
If your server performs well, has sufficient bandwidth, and your PHP scripts contain no infinite loops or bugs, you can set "request_terminate_timeout" to 0s, which lets PHP-CGI run with no time limit. If you cannot guarantee that (your PHP-CGI may have a bug, your bandwidth may be insufficient, or something else may cause PHP-CGI to hang), give "request_terminate_timeout" a value based on your server's performance. In general, the better the performance, the higher you can set it.
 
How is the value of "max_children" calculated? In principle, the larger the value, the better: with more php-cgi processes, requests are handled quickly and very few wait in the queue. The limit is set by the server's memory. If each php-cgi process consumes about 20 MB and "max_children" is set to 80, then 20 MB * 80 = 1600 MB, meaning that at peak all PHP-CGI processes together consume about 1.6 GB, which must be less than the available memory.
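The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the 2048 MB of available memory is an assumed figure, not a value from the article.

```python
def peak_memory_mb(max_children: int, per_process_mb: int) -> int:
    """Peak memory used if every php-cgi process is busy at once."""
    return max_children * per_process_mb


def max_children_for(available_mb: int, per_process_mb: int) -> int:
    """Largest process count whose combined peak memory still fits."""
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    return available_mb // per_process_mb


# The article's example: 80 processes at about 20 MB each.
print(peak_memory_mb(80, 20))      # 1600

# Hypothetical server with 2048 MB left over for PHP-CGI.
print(max_children_for(2048, 20))  # 102
```

Measure the per-process figure on your own server before relying on a result like this, since it varies with the application.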
 
If "max_children" is set too low, for example 5-10, the php-cgi processes will be "very tired": processing is slow and waiting times are long. If a request goes unprocessed for a long time, a 504 Gateway Time-out occurs; if one of the "very tired" php-cgi processes runs into a problem while handling a request, a 502 Bad Gateway occurs.
 
The following is a more detailed introduction:
 
Websites running on Nginx sometimes encounter "502 Bad Gateway" errors, in some cases frequently. The following Nginx 502 troubleshooting methods were collected by the editor for your reference:
 
Most Nginx 502 errors are caused by problems with the backend servers when nginx runs in proxy mode. These are not nginx problems, so the cause has to be found on the backend. However, nginx reports all of these errors under its own name, which draws a lot of criticism onto the people who promote nginx. After all, how is anyone supposed to read "bad gateway"? Doesn't it just mean "bad nginx"? People who do not know better put the blame directly on nginx. I hope a future version of nginx makes the error message friendlier, so that it is at least more than a bare "502 Bad Gateway" with nginx's own name attached.
 
Nginx 502 trigger conditions
 
The most common cause of a 502 error is a backend host that is down. The upstream configuration has a directive, proxy_next_upstream, which specifies which errors encountered while fetching data from one backend host make nginx move on to the next backend host. The default is error timeout. "error" means the host is down or the connection was dropped, and "timeout" means a read blocked until it timed out, which is easy to understand. I generally write all of them:
proxy_next_upstream error timeout invalid_header http_500 http_503;
 
However, I now think I should remove the http_500 option. With http_500 specified, nginx switches to another host whenever a backend returns a 500 error. Previously, when a backend JSP failed, a stack trace was printed; now the user sees a 502 instead. The programmers at my company do not see it that way, though: they assume nginx itself has hit an error, and I really do not have time to explain how 502 works to them...
 
The http_503 option can stay, because the backend is usually Apache or Resin. If Apache crashes, it counts as error, but if Resin crashes, the result is only a 503, so it is worth keeping.
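To make the effect of the condition list concrete, here is a deliberately simplified Python model of the failover decision. It is not how nginx is implemented: the function name and the outcome strings are assumptions for this sketch, and it only answers which backend's response would be passed on, not which status nginx returns when every backend fails.

```python
from typing import FrozenSet, Optional, Sequence

# nginx's default conditions, as stated in the text above.
DEFAULT_CONDITIONS = frozenset({"error", "timeout"})

# The list the author usually writes.
AUTHOR_CONDITIONS = frozenset(
    {"error", "timeout", "invalid_header", "http_500", "http_503"}
)


def pick_backend(
    outcomes: Sequence[str],
    retry_on: FrozenSet[str] = DEFAULT_CONDITIONS,
) -> Optional[int]:
    """Return the index of the first backend whose outcome is not a retry
    condition, or None if every backend was skipped.

    outcomes[i] is what backend i would produce for this request,
    e.g. "ok", "error", "timeout", "http_500", "http_503".
    """
    for index, outcome in enumerate(outcomes):
        if outcome not in retry_on:
            return index
    return None


# With the default list, a 500 from the first backend is passed to the client.
print(pick_backend(["http_500", "ok"]))                     # 0
# With http_500 in the list, the request moves on to the second backend.
print(pick_backend(["http_500", "ok"], AUTHOR_CONDITIONS))  # 1
```

Dropping http_500 from the list, as considered above, corresponds to the first call: the backend's own 500 response reaches the client unchanged.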
 
Solution
 
If you run into a 502 problem, try the following two steps first.
 
1. Check whether the current number of PHP FastCGI processes is sufficient:
netstat -anpo | grep "php-cgi" | wc -l
 
If the number of FastCGI processes actually used is close to the preset number of FastCGI processes, it indicates that the number of FastCGI processes is insufficient and needs to be increased.
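The same check can be scripted. The sketch below counts php-cgi lines in text captured from the netstat command and compares the count with the configured process limit. The function names, the 90% threshold, and the sample netstat lines are all assumptions made for illustration.

```python
def count_php_cgi(netstat_output: str) -> int:
    """Count lines that mention php-cgi, like `grep "php-cgi" | wc -l`."""
    return sum(1 for line in netstat_output.splitlines() if "php-cgi" in line)


def is_pool_nearly_full(in_use: int, configured: int, threshold: float = 0.9) -> bool:
    """True when the processes in use reach the given share of the limit."""
    return in_use >= configured * threshold


# Hypothetical sample input, not real output from any server.
sample = """\
tcp 0 0 127.0.0.1:9000 0.0.0.0:* LISTEN 2301/php-cgi off (0.00/0/0)
tcp 0 0 127.0.0.1:9000 127.0.0.1:51234 ESTABLISHED 2302/php-cgi off (0.00/0/0)
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 1200/nginx off (0.00/0/0)
"""

in_use = count_php_cgi(sample)
print(in_use)                           # 2
print(is_pool_nearly_full(in_use, 80))  # False
```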
 
2. If some PHP programs take longer to execute than Nginx is willing to wait, increase the FastCGI timeouts in the nginx.conf configuration file, for example:
http {
fastcgi_connect_timeout 300;
fastcgi_send_timeout 300;
fastcgi_read_timeout 300;
......
}
......
 
Errors can also occur when memory_limit in php.ini is set too low. Raise memory_limit in php.ini to 64 MB and restart nginx; if the error disappears, the cause was insufficient PHP memory.
 
If the problem persists, you can refer to the following solutions:
 
1. max_children and max_requests
 
One server runs nginx, PHP (FPM), and XCache, with average traffic of around 300 PVs per day.
 
Recently the following has happened often: PHP pages become very slow to open, CPU usage suddenly drops to a very low level, system load suddenly climbs to a very high level, and the network card traffic also suddenly falls to a very low level. Each episode lasts only a few seconds before things recover.
 
Checking the php-fpm log file turned up some clues:
Sep 30 08:32:23.289973 [NOTICE] fpm_unix_init_main(), line 271: getrlimit(nofile): max:51200, cur:51200
Sep 30 08:32:23.290212 [NOTICE] fpm_sockets_init_main(), line 371: using inherited socket fd=10, "127.0.0.1:9000"
Sep 30 08:32:23.290342 [NOTICE] fpm_event_init_main(), line 109: libevent: using epoll
Sep 30 08:32:23.296426 [NOTICE] fpm_init(), line 47: fpm is running, pid 30587
 
Before these lines, the log contained more than 1000 lines recording children being shut down and started.
 
It turns out that php-fpm has a max_requests parameter, which specifies the maximum number of requests each child handles before it is shut down; the default is 500. Because PHP hands requests to the children round-robin, under heavy traffic every child reaches max_requests at almost the same time, so all the children are shut down at roughly the same moment.
 
During this window nginx cannot hand PHP files to php-fpm for processing, so CPU usage drops very low (no PHP is being processed and no SQL executed), the load climbs very high (children are being shut down and started while nginx waits for php-fpm), and NIC traffic also drops very low (nginx has no data to send to clients).
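A small simulation shows why the shutdowns cluster together. It assumes strict round-robin dispatch, as described above, and uses tiny numbers (4 children, max_requests of 5) so the result is easy to read; the function name is made up for this sketch.

```python
from typing import List


def recycle_points(children: int, max_requests: int, total_requests: int) -> List[int]:
    """Return the request indexes (0-based) at which a child hits
    max_requests and would be shut down and restarted."""
    handled = [0] * children
    points = []
    for request_index in range(total_requests):
        child = request_index % children  # strict round-robin dispatch
        handled[child] += 1
        if handled[child] == max_requests:
            points.append(request_index)
            handled[child] = 0  # the replacement child starts from zero
    return points


# Nothing is recycled for 16 requests, then all 4 children go back to back.
print(recycle_points(4, 5, 20))  # [16, 17, 18, 19]
```

In this model every child is recycled within one pass over the pool, which matches the short, periodic stalls described above.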
 
The fix is simple: increase the number of children, and set max_requests to 0 or to a larger value:
 
Open /usr/local/php/etc/php-fpm.conf and raise the following two parameters (choose values that suit your server; setting them too high does not help):
<value name="max_children">5120</value>
<value name="max_requests">600</value>
 
Restart php-fpm.
 
2. Increase the buffer capacity
The nginx error log showed a message like "upstream sent too big header while reading response header from upstream". After some research, the problem was attributed to a bug in nginx's buffering; the pages on our site may simply consume too much buffer space. Following a fix found in English-language sources, we added buffer size settings, and the 502 problem was completely solved. Later, the system administrator tuned the parameters and kept only two of them: the client header buffer and the FastCGI buffer size.
 
3. request_terminate_timeout
 
If the 502 errors show up mainly on POST requests or database operations rather than on static pages, take a look at this setting in php-fpm.conf:

request_terminate_timeout

This value plays the same role as max_execution_time: it is the maximum execution time of a FastCGI script.

0s

0s means the limit is disabled, so scripts run indefinitely. (When I first installed it, I changed this number without looking at it carefully.) With the limit disabled, the problem was solved: even long-running scripts no longer fail. When tuning FastCGI, you can also try changing the value to 5s and see the effect.
 
A 502 error occurs when there are not enough php-cgi processes, when PHP execution takes too long, or when a php-cgi process has died.
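The interaction between the two timeouts discussed in this article can be summarized in a rough model. It rests on simplifying assumptions that go beyond the text: a script killed by request_terminate_timeout is taken to surface as a 502, a script that outlasts fastcgi_read_timeout is taken to surface as a 504, and fastcgi_read_timeout is treated as a limit on total response time. The function name is hypothetical.

```python
import math


def expected_status(
    script_seconds: float,
    request_terminate_timeout: float,
    fastcgi_read_timeout: float,
) -> int:
    """Rough guess at the status a client sees for a slow script."""
    # 0 disables the php-fpm limit, so treat it as "no limit".
    fpm_limit = request_terminate_timeout if request_terminate_timeout > 0 else math.inf
    if script_seconds <= min(fpm_limit, fastcgi_read_timeout):
        return 200  # the script finishes before either limit
    # Whichever limit is reached first decides the error.
    return 502 if fpm_limit <= fastcgi_read_timeout else 504


print(expected_status(10, 0, 300))   # 200: no php-fpm limit, within nginx's timeout
print(expected_status(10, 5, 300))   # 502: php-fpm terminates the script first
print(expected_status(400, 0, 300))  # 504: nginx gives up waiting first
```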