When nginx returns 502 Bad Gateway, the cause is usually the backend program: it has crashed or is responding too slowly. Sometimes, however, the problem lies with nginx itself. For example, after nginx is restarted, every request returns 502 and the error log fills with "no live upstreams" messages.
In our case the cause was that we had misused the upstream check module. Here is my configuration.
Main configuration file, nginx.conf:
worker_processes 4;
pid logs/nginx.pid;
error_log /tmp/logs/error.log;
events {
    worker_connections 1024;
}
http {
    include mime.types;
    default_type application/octet-stream;
    # Omitted configuration
    include upstreamA.conf;
    server {
        listen 8889;
        server_name localhost;
        access_log /tmp/logs/m-access.log main;
        error_log /tmp/logs/m-error.log;
        location ~ / {
            proxy_pass http://tornado_servers;
        }
    }
}
Then upstreamA.conf:
upstream tornado_servers {
    server 127.0.0.1:29100 weight=100;
    server 127.0.0.1:29101 weight=100;
    check interval=30000 rise=3 fall=3 timeout=30000 type=tcp;
}
Then upstreamB.conf:
upstream tornado_servers {
    server 127.0.0.1:29110 weight=100;
    server 127.0.0.1:29111 weight=100;
    check interval=30000 rise=3 fall=3 timeout=30000 type=tcp;
}
If you are interested, you can test with this configuration. Note that nginx must be built with the nginx_upstream_check_module plug-in.
Our release process is as follows. If upstreamA is currently live, we start the new version of the Python program listening on the ports set in upstreamB, change upstreamA to upstreamB in the nginx configuration, and reload nginx to bring the new program online.
The problem is that immediately after the reload, the "no live upstreams" error appears and every request returns 502.
The reason was mentioned above: misuse of the check module. Let's look at the documentation for the check directive.
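To see what the check module thinks of each backend at that moment, the module's status page can help. The following is a minimal sketch, assuming nginx was built with nginx_upstream_check_module; the /status path and the allow/deny rules are illustrative choices, not part of the original setup. An exact-match location is used because the existing "location ~ /" regex would otherwise capture the request.

```nginx
# Goes inside the server { } block of nginx.conf.
location = /status {
    check_status;       # status page provided by nginx_upstream_check_module
    access_log off;
    allow 127.0.0.1;    # assumption: only local access is wanted
    deny all;
}
```

With this in place, requesting http://localhost:8889/status from the same machine right after a reload should list each server in tornado_servers together with its current up/down state and rise/fall counts.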
Syntax: check interval=milliseconds [fall=count] [rise=count] [timeout=milliseconds] [default_down=true|false] [type=tcp|http|ssl_hello|mysql|ajp] [port=check_port]
Default: if no parameters are given, the defaults are interval=30000 fall=5 rise=2 timeout=1000 default_down=true type=tcp.
Context: upstream
Description:
This directive enables the health check function for backend servers.
The parameters following the directive mean:
interval: the interval between health check packets sent to the backend.
fall (fall_count): if the number of consecutive failures reaches fall_count, the server is considered down.
rise (rise_count): if the number of consecutive successes reaches rise_count, the server is considered up.
timeout: the timeout of a health check request to the backend.
default_down: sets the initial state of the server. If true, the server starts out as down; if false, it starts out as up. The default is true, that is, the server is considered unavailable at first and is not considered healthy until the number of successful health checks reaches rise_count.
type: the type of health check packet. The following types are supported:
tcp: a simple TCP connection. If the connection succeeds, the backend is considered normal.
ssl_hello: sends an initial SSL hello packet and receives the server's SSL hello packet.
http: sends an HTTP request and uses the status of the backend's response to determine whether the backend is alive.
mysql: connects to the MySQL server and determines whether the backend is alive by receiving the server's greeting packet.
ajp: sends a CPing packet of the AJP protocol to the backend and determines whether the backend is alive by receiving the CPong packet.
port: specifies the port to check on the backend server. It may differ from the port of the real service. For example, if the backend serves an application on port 443, you can check port 80 to determine the backend's health. The default is 0, which means the same port as the real service. This option was added in Tengine 1.4.0.
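As an illustration of the type parameter, here is a sketch of an HTTP check for the Tornado backends. It uses the module's check_http_send and check_http_expect_alive directives. It assumes the application answers "HEAD /" with a 2xx or 3xx status, and the interval and timeout values are illustrative rather than recommendations.

```nginx
upstream tornado_servers {
    server 127.0.0.1:29110 weight=100;
    server 127.0.0.1:29111 weight=100;

    # Probe with a real HTTP request instead of a bare TCP connect.
    check interval=3000 rise=2 fall=3 timeout=1000 type=http;
    # Request sent on each check (assumption: "/" is cheap to serve).
    check_http_send "HEAD / HTTP/1.0\r\n\r\n";
    # Response statuses that count as a successful check.
    check_http_expect_alive http_2xx http_3xx;
}
```

Compared with type=tcp, this also catches a backend whose port accepts connections but whose application no longer responds.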
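If you overlook default_down and leave it at its default of true, then switch the upstream file to launch the new program, this error will occur: after the reload the new backends start out marked as down, and until each one passes rise consecutive health checks nginx has no live upstream to proxy to. With interval=30000 and rise=3 as in my configuration, that window can last on the order of a minute or more.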
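One possible way to avoid the outage, sketched under the assumption that the new program is already listening before nginx is reloaded, is to set default_down=false and shorten the check interval. The specific numbers are illustrative, not values from the original setup.

```nginx
# upstreamB.conf, adjusted (sketch)
upstream tornado_servers {
    server 127.0.0.1:29110 weight=100;
    server 127.0.0.1:29111 weight=100;

    # default_down=false: servers are treated as up right after reload,
    # and are only marked down after `fall` consecutive failed checks.
    check interval=3000 rise=2 fall=3 timeout=1000 default_down=false type=tcp;
}
```

The trade-off is that a backend which is not actually running will receive traffic until it fails fall consecutive checks (roughly fall × interval, about 9 seconds with these values), so the release script should confirm the new ports are accepting connections before reloading nginx.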