Analysis of the RR and ip_hash Strategies in Nginx Load Balancing
Nginx's upstream module currently supports the following load-balancing methods:
1. RR (default)
Each request is assigned to a different backend server one by one, in order; if a backend server goes down, it is automatically removed from the rotation.
For example:
upstream tomcats {
    server 10.1.1.107:88 max_fails=3 fail_timeout=3s weight=9;
    server 10.1.1.132:80 max_fails=3 fail_timeout=3s weight=9;
}
2. ip_hash
Each request is assigned according to a hash of the client's IP address, so that each visitor consistently reaches the same backend server; this solves the session-persistence problem.
For example:
upstream tomcats {
    ip_hash;
    server 10.1.1.107:88;
    server 10.1.1.132:80;
}
3. fair (third-party)
Requests are assigned according to the response time of each backend server; servers with shorter response times are given priority.
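For example (a sketch assuming the third-party nginx-upstream-fair module has been compiled in; the directive name follows that module):

upstream tomcats {
    fair;
    server 10.1.1.107:88;
    server 10.1.1.132:80;
}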
4. url_hash (third-party)
Requests are assigned according to a hash of the requested URL, so that each URL is always directed to the same backend server; this is more efficient when the backend servers cache content.
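For example (a sketch assuming the third-party upstream hash module has been compiled in; hashing on $request_uri is an assumption of typical usage):

upstream tomcats {
    hash $request_uri;
    server 10.1.1.107:88;
    server 10.1.1.132:80;
}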
Below, we analyze the RR and ip_hash load-balancing strategies. Since every load-balancing strategy runs inside the upstream framework, where upstream controls the overall workflow and the strategy only supplies the functions for selecting or releasing a server, we analyze RR together with upstream (ngx_http_upstream.c). ip_hash is largely identical to RR; it merely reimplements the ngx_http_upstream_get_peer function used by RR.
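For reference, a minimal sketch of what that reimplementation does: ip_hash hashes the first three octets of the client's IPv4 address and maps the result onto the peers array. The constants follow ngx_http_upstream_ip_hash_module.c; the function name and the surrounding simplifications are illustrative, not the nginx source.

static unsigned ip_hash_get_peer(const unsigned char addr[4], unsigned number)
{
    unsigned hash = 89;    /* initial hash value used by the module */
    int      i;

    /* Only the first three octets are hashed, so all clients in the
     * same /24 network are directed to the same backend server. */
    for (i = 0; i < 3; i++) {
        hash = (hash * 113 + addr[i]) % 6271;
    }

    return hash % number;  /* index into the peers array */
}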
2. RR Policy
The RR mechanism is divided into three parts: initializing the upstream, getting an available backend server, and releasing the backend server.
The following analysis takes this configuration as an example:
upstream backend {
    server A max_fails=3 fail_timeout=4s weight=9;
    server B max_fails=3 fail_timeout=4s weight=9;
    server C max_fails=3 fail_timeout=4s weight=9;
    server D backup;
    server E backup;
}
2.1 Initialization of Upstream
For the upstream backend in the example,
First, each server is initialized. Besides the IP address and port, the fields weight, current_weight, max_fails, and fail_timeout are set. The max_fails and fail_timeout parameters work together: once a server has failed max_fails times, it is considered unavailable for the next fail_timeout seconds.
For serverA, the settings are as follows:
serverA.weight = 9;
serverA.current_weight = 9;   /* initialized to the weight from the configuration file */
serverA.max_fails = 3;
serverA.fail_timeout = 4;
Next, two arrays of server type are created (server here is equivalent to the peer type used below; it stores the information of one server in the upstream): peers and backup, holding the normal round-robin servers and the backup servers respectively. Within each array, the servers are sorted by weight, from high to low.
In this example, serverA, serverB, and serverC are stored in the array peers, and the total number of servers is recorded as peers->number = 3; serverD and serverE are stored in the array backup, with backup->number = 2.
Finally, set the values for each variable in the upstream.
rrp refers to the array of servers currently being round-robined; initially upstream->rrp = peers.
tries is the number of attempts remaining; each time a server is tried and fails, tries is decremented by one. It is initially set to the total number of servers in peers.
next is used when all servers in the peers array have failed and can no longer provide service; through upstream->next, selection switches to the backup array.
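Putting 2.1 together, a simplified sketch of the state after initialization is shown below (field names follow the discussion above; the real nginx types, ngx_http_upstream_rr_peer_t and ngx_http_upstream_rr_peers_t, carry more fields):

#include <time.h>

/* One backend server (a "peer"). */
typedef struct {
    int      weight;          /* from the server directive             */
    int      current_weight;  /* starts equal to weight                */
    unsigned max_fails;       /* failure threshold                     */
    time_t   fail_timeout;    /* seconds a failed server stays invalid */
    unsigned fails;           /* failures seen so far, starts at 0     */
} peer_t;

/* An array of peers; upstream->rrp points at one of these, and
 * next links the normal array to the backup array. */
typedef struct peers_s {
    unsigned        number;   /* 3 for peers, 2 for backup here        */
    struct peers_s *next;     /* peers->next = backup; backup->next = NULL */
    peer_t         *peer;     /* entries sorted by weight, high to low */
} peers_t;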
2.2 The RR Policy in Detail
2.2.1 When a client request arrives at Nginx, Nginx selects from the peers array the server with the largest current_weight as the first server to try for this request, assigns its index to the rrp->current variable, and jumps to 2.2.2. The algorithm for finding the server with the largest current_weight in the peers array is as follows:
Because the servers in the peers array are sorted by weight from high to low, a double loop looks for a server n that satisfies the condition
if (peer[n].current_weight * 1000 / peer[i].current_weight > peer[n].weight * 1000 / peer[i].weight)   /* peer[i].current_weight is not 0 */
and whose current_weight is greater than 0. Server n is selected, its index n is assigned to rrp->current, and the function returns successfully.
If all servers in the peers array have a current_weight of zero, the current_weight of every server is immediately and unconditionally reset to its initial value:
for (i = 0; i < peers->number; i++) {
    peer[i].current_weight = peer[i].weight;
}
Then, with all current_weights back at their initial values, the server with the largest current_weight in the peers array is searched for again; its index is assigned to rrp->current and returned.
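The whole of 2.2.1 can be condensed into the self-contained sketch below, modeled on the selection loop described above (the peer_t type and the main driver are illustrative additions, not the nginx source):

#include <stdio.h>

typedef struct {
    int weight;          /* configured weight                    */
    int current_weight;  /* decremented by one on each selection */
} peer_t;

/* Returns the index of the peer to use next; assumes the array is
 * sorted by weight, high to low, as described in 2.1. */
static int get_peer(peer_t *peer, int number)
{
    int i, n;

    for ( ;; ) {
        for (i = 0; i < number; i++) {
            if (peer[i].current_weight <= 0) {
                continue;
            }

            n = i;

            while (i < number - 1) {
                i++;

                if (peer[i].current_weight <= 0) {
                    continue;
                }

                /* The ratios are scaled by 1000 so the comparison
                 * can be done in integer arithmetic. */
                if (peer[n].current_weight * 1000 / peer[i].current_weight
                    > peer[n].weight * 1000 / peer[i].weight)
                {
                    return n;
                }

                n = i;
            }

            return n;
        }

        /* Every current_weight is zero: reset all of them to the
         * initial values, then search again. */
        for (i = 0; i < number; i++) {
            peer[i].current_weight = peer[i].weight;
        }
    }
}

int main(void)
{
    peer_t peers[3] = { {9, 9}, {9, 9}, {9, 9} };  /* serverA/B/C */
    int i, n;

    for (i = 0; i < 6; i++) {
        n = get_peer(peers, 3);
        peers[n].current_weight--;   /* done by the caller in 2.2.3 */
        printf("request %d -> server %d\n", i + 1, n);
    }
    return 0;
}

With three equal-weight servers the driver prints indices rotating 2, 1, 0, 2, 1, 0, i.e. a plain round robin; with unequal weights, the scaled ratio test steers proportionally more requests to the heavier servers.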
2.2.2 Determine whether the server that rrp->current points to is valid. If it is invalid, advance rrp->current++ and test the next server in the peers array, repeating until a valid server is found, then jump to 2.2.3. Otherwise jump to 2.2.2.1.
The method for deciding whether a server is valid is as follows (a code sketch is given after the list):
1) The server is valid if its failure count (peers->peer[i].fails) has not reached the maximum number of failures set by max_fails.
2) If the server's failure count has reached the maximum set by max_fails, the server is invalid for the period set by fail_timeout, counted from that moment on.
3) When the server's failure count (peers->peer[i].fails) has reached the maximum but the interval since then exceeds the period set by fail_timeout, peers->peer[i].fails is reset to 0 so that the server can work again.
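A minimal sketch of this three-rule test, assuming the simplified peer_t fields from 2.1 plus a checked timestamp recording when the failure limit was hit (the helper name is illustrative):

#include <time.h>

typedef struct {
    unsigned fails;        /* failures observed so far       */
    unsigned max_fails;    /* from the server directive      */
    time_t   fail_timeout; /* seconds, from the directive    */
    time_t   checked;      /* when the failure limit was hit */
} peer_t;

static int peer_is_valid(peer_t *p, time_t now)
{
    if (p->fails < p->max_fails) {
        return 1;                          /* rule 1: under the limit  */
    }

    if (now - p->checked <= p->fail_timeout) {
        return 0;                          /* rule 2: still timed out  */
    }

    p->fails = 0;                          /* rule 3: timeout elapsed, */
    return 1;                              /* the server works again   */
}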
2.2.2.1 If all servers in peers are invalid, Nginx tries to find a valid server in the backup array. If one is found, it jumps to 2.2.3. If still none is found, no server in the upstream is available at this moment, so the failure records of all servers in the peers array are cleared, making every server valid again. This is done so that the next request has a chance of finding a valid server:
for (i = 0; i < peers->number; i++) {
    peers->peer[i].fails = 0;
}
An error code is then returned to Nginx. After receiving this error code, Nginx no longer sends the request to a backend server; instead it writes the record "no live upstreams while connecting to upstream" to the Nginx error log (this is the real cause of "no live upstreams") and returns a 502 error directly to the requesting client.
2.2.3 When a valid server is found, the server's current_weight is decremented by one, and Nginx then attempts to establish a connection with the server. If the connection is established successfully, jump to 2.2.4; otherwise jump to 2.2.3.1.
2.2.3.1 If, after waiting for the period set by proxy_connect_timeout (for example, 3 seconds), the connection still has not been established successfully, Nginx writes "upstream timed out" to the error log.