The load master offers many load balancing methods, which are often called scheduling methods or algorithms:
Round Robin
With this method, incoming requests are distributed in turn to each machine in the server cluster, that is, to the active servers. If you choose this method, all servers assigned to the virtual service should have similar resource capacity and run the same applications. Choose round robin if all servers have the same or similar performance and carry the same load. Under this premise, round robin is a simple and efficient way to distribute requests. If the servers differ in capacity, however, a weaker server still receives its turn in the next round even if it has not yet finished handling its current request. This can overload the weaker servers.
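As a minimal sketch of the rotation described above (the server names "A", "B" and "C" and the helper name `round_robin` are illustrative assumptions, not part of any product):

```python
from itertools import cycle


def round_robin(servers):
    """Return an iterator that hands out servers in a fixed, repeating order."""
    return cycle(servers)


scheduler = round_robin(["A", "B", "C"])  # hypothetical server names
first_five = [next(scheduler) for _ in range(5)]
print(first_five)  # ['A', 'B', 'C', 'A', 'B']
```

The scheduler never looks at how busy a server is, which is why the method only suits servers of similar capacity.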
Weighted Round Robin
This algorithm addresses the weakness of simple round robin: incoming requests are still assigned to the servers in the cluster sequentially, but a weight assigned to each server in advance is taken into account. The administrator simply defines each server's weight according to its processing power. For example, the most powerful server, A, is given a weight of 100, while the least capable server, B, is given a weight of 50. Server A then receives two consecutive requests before server B receives its first request, and so on.
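The 100-to-50 example can be sketched as follows. This is one simple way to realize the behavior, assuming integer weights that are reduced by their greatest common divisor; the function name `weighted_round_robin` is hypothetical, and real load balancers may interleave requests differently.

```python
from functools import reduce
from itertools import cycle
from math import gcd


def weighted_round_robin(weights):
    """Repeat each server in proportion to its weight, then rotate forever."""
    divisor = reduce(gcd, weights.values())  # 100 and 50 reduce to 2 and 1
    sequence = []
    for server, weight in weights.items():
        sequence.extend([server] * (weight // divisor))
    return cycle(sequence)


wrr = weighted_round_robin({"A": 100, "B": 50})
wrr_first_six = [next(wrr) for _ in range(6)]
print(wrr_first_six)  # ['A', 'A', 'B', 'A', 'A', 'B']
```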
Least Connection
Neither of the methods above takes into account how many connections each server is maintaining at a given time. As a result, server B may receive fewer connections than server A and still become overloaded, because the users on server B keep their connections open longer. In other words, connections, and with them the server load, accumulate. The "least connections" algorithm avoids this potential problem: incoming requests are allocated based on the number of connections each server currently has open, so the server with the fewest active connections automatically receives the next request. The same principle applies as for simple round robin: all servers assigned to the virtual service should have similar resource capacity. Note that in configurations with a low traffic rate, the traffic does not balance out across the servers and the first server is preferred. This is because when all servers are equal, the first server takes precedence; until traffic reaches a level at which the first server continuously has active connections, the first server is always selected.
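A small sketch of the selection rule, assuming the balancer keeps a per-server count of open connections (the dictionary `active` and the function name are illustrative). It also reproduces the low-traffic effect: when every connection closes before the next request arrives, all servers tie at zero and the first one wins every time.

```python
def pick_least_connections(active):
    """Return the server with the fewest open connections.

    min() returns the first minimum it encounters, so ties go to the
    server listed first.
    """
    return min(active, key=active.get)


busy_pick = pick_least_connections({"A": 3, "B": 1, "C": 2})

# Low traffic: each connection finishes before the next request arrives.
active = {"A": 0, "B": 0, "C": 0}
low_traffic_picks = []
for _ in range(3):
    server = pick_least_connections(active)
    active[server] += 1  # connection opens
    low_traffic_picks.append(server)
    active[server] -= 1  # ...and closes before the next request
print(busy_pick, low_traffic_picks)  # B ['A', 'A', 'A']
```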
Least Connection Slow Start Time
For the least connection and weighted least connection scheduling methods, you can configure a period of time for a server that has just come online, during which the number of connections it receives is limited and slowly increased. This gives the server a "transition time" so that it is not overloaded by an excessive number of connections allocated right after it starts. This value is set in the L7 configuration interface.
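The text does not say how the limit grows during the transition time. The sketch below assumes a linear ramp purely for illustration; `connection_cap`, `slow_start_seconds` and `full_cap` are hypothetical names, not configuration options of any product.

```python
def connection_cap(seconds_online, slow_start_seconds, full_cap):
    """Assumed linear ramp: the cap grows from 1 to full_cap over the period."""
    if slow_start_seconds <= 0 or seconds_online >= slow_start_seconds:
        return full_cap
    fraction = max(seconds_online, 0) / slow_start_seconds
    return max(1, int(full_cap * fraction))


# A server with a 60-second slow start and a full cap of 100 connections.
caps = [connection_cap(t, 60, 100) for t in (0, 15, 30, 60)]
print(caps)  # [1, 25, 50, 100]
```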
Weighted Least Connection
If the servers have different resource capacities, the "weighted least connection" method is more appropriate: combining the number of active connections with the weights the administrator assigns according to each server's capacity generally yields a very balanced utilization of the servers, because it draws on the advantages of both least connections and weighting. This is typically a very fair allocation because it uses the ratio of the number of connections to the server weight, and the server in the cluster with the lowest ratio automatically receives the next request. When using this method in a low-traffic situation, note the considerations described for the "Least Connection" method.
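One way to read this rule is to pick the server with the lowest ratio of active connections to weight. The sketch below assumes that reading; the function name and the sample numbers are illustrative.

```python
def pick_weighted_least_connections(active, weights):
    """Return the server with the lowest connections-to-weight ratio."""
    return min(active, key=lambda server: active[server] / weights[server])


weights = {"A": 100, "B": 50}
# A holds more connections, but relative to its weight it is less loaded:
# 6 / 100 = 0.06 for A versus 4 / 50 = 0.08 for B.
chosen = pick_weighted_least_connections({"A": 6, "B": 4}, weights)
print(chosen)  # A
```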
Agent Based Adaptive Balancing
In addition to the methods described above, the load master contains adaptive logic that periodically monitors the server state and the server weights. With the very powerful agent-based adaptive balancing method, the load master periodically checks the load on all servers as follows: each server must provide a file containing a number from 0 to 99 that indicates the actual load on the server (0 = idle, 99 = overloaded, 101 = failed, 102 = disabled by the administrator), and the load master retrieves this file with an HTTP GET request. It is each server's job to report its own load in this file; how the server calculates that load is not restricted. Depending on the overall load of the servers, one of two strategies is applied. In normal operation, the scheduling algorithm calculates a weighting ratio from the collected server load values and allocates connections to the servers accordingly, so if a server comes under heavy load, the system readjusts the weights transparently. As with weighted round robin, an improper distribution can be corrected by assigning different weights to the individual servers. In a very low-traffic environment, however, the load values reported by the servers do not form a representative sample, and allocating load based on them would lead to uncontrolled, oscillating decisions. In that case it is more reasonable to calculate the load distribution from the static weighting ratio. When the load on all servers falls below a lower limit defined by the administrator, the load master automatically switches to weighted round robin to allocate requests; when the load rises above that limit, the load master switches back to adaptive mode.
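The mode switch and the handling of reported values can be sketched as below. The special values 101 and 102 and the 0 to 99 range come from the description above; the formula `100 - load` for turning a load value into a weight is only an assumption for illustration, since the text does not give the actual calculation, and all function names are hypothetical.

```python
def interpret_load(value):
    """Classify the number a server reports in its load file."""
    if value == 101:
        return "failed"
    if value == 102:
        return "disabled"
    if 0 <= value <= 99:
        return "ok"
    return "invalid"


def choose_mode(loads, lower_limit):
    """Use static weights while every usable server is below the limit."""
    usable = [v for v in loads.values() if interpret_load(v) == "ok"]
    if usable and all(v < lower_limit for v in usable):
        return "weighted round robin"
    return "adaptive"


def adaptive_weights(loads):
    """Assumed formula: the lighter the reported load, the higher the weight."""
    return {s: 100 - v for s, v in loads.items() if interpret_load(v) == "ok"}


print(choose_mode({"A": 5, "B": 8}, lower_limit=10))   # weighted round robin
print(choose_mode({"A": 5, "B": 80}, lower_limit=10))  # adaptive
print(adaptive_weights({"A": 20, "B": 80, "C": 101}))  # {'A': 80, 'B': 20}
```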
Fixed Weighted
The server with the highest weight is used only when the other servers are given lower weight values. If the highest-weighted server fails, however, the server with the next highest priority serves the clients. With this method, the weight of each real server needs to be configured according to its priority among the servers.
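In effect this is a priority failover scheme. A minimal sketch, assuming the balancer knows which servers are currently up (the `is_up` mapping and the function name are illustrative):

```python
def pick_fixed_weight(weights, is_up):
    """Return the available server with the highest weight, or None."""
    candidates = [s for s in weights if is_up.get(s, False)]
    if not candidates:
        return None
    return max(candidates, key=weights.get)


weights = {"A": 100, "B": 50, "C": 10}
normal = pick_fixed_weight(weights, {"A": True, "B": True, "C": True})
after_failure = pick_fixed_weight(weights, {"A": False, "B": True, "C": True})
print(normal, after_failure)  # A B
```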
Weighted Response
Traffic is scheduled using the weighted round robin method. The weights it uses are calculated from the response times of the server health checks. Each health check is timed to see how long the server takes to respond successfully. Note that this approach assumes that how quickly a server answers the health check reflects the overall speed of the machine, an assumption that may not always hold. The response times of all servers on the virtual service are added together, and from that sum the weight of each individual real server is calculated. The weights are recalculated approximately every 15 seconds.
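The exact formula is not given here. The sketch below assumes one plausible reading, in which each weight is the summed response time divided by the server's own response time, so that faster servers get larger weights. The function name and the sample timings (in milliseconds) are hypothetical.

```python
def response_time_weights(response_times_ms):
    """Assumed formula: weight = total response time / own response time.

    Response times are expected to be positive numbers.
    """
    total = sum(response_times_ms.values())
    return {s: round(total / t) for s, t in response_times_ms.items()}


# Server A answers its health check three times faster than server B.
rt_weights = response_time_weights({"A": 100, "B": 300})
print(rt_weights)  # {'A': 4, 'B': 1}
```

The resulting weights would then feed a weighted round robin scheduler and be recomputed after each round of health checks.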
Source IP Hash
This method generates a hash of the request's source IP and uses the hash value to select the real server. This means that the same host is always directed to the same server. With this method, no source IP information needs to be stored. Note, however, that this approach can cause load imbalance across the servers.
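A minimal sketch of the idea, assuming SHA-256 as the hash function and a simple modulo over the server list (both are illustrative choices; the text does not say which hash is used):

```python
import hashlib


def pick_by_source_ip(source_ip, servers):
    """Map a source IP to a server using a stable hash of the address."""
    digest = hashlib.sha256(source_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["A", "B", "C"]  # hypothetical server names
first = pick_by_source_ip("203.0.113.7", servers)
second = pick_by_source_ip("203.0.113.7", servers)
print(first == second)  # True: the same host always maps to the same server
```

Because the mapping is computed from the address alone, no per-client table is needed, but a few very active clients can still land on the same server and unbalance the load.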
1. This article was selected by Programmer Architecture.
2. This article is translated from http://www.loadbalancerblog.com/blog/2013/06/load-balancing-scheduling-methods-explained
3. When reprinting, please credit the source: Programmer Architecture (No.: archleaner)