reproduced from this blog: http://blog.csdn.net/u014649204/article/details/25115039
The design of the balancing algorithm directly determines how well a cluster performs under load balancing; a poorly designed algorithm leaves the load on the cluster unbalanced. The main task of a balancing algorithm is to decide which cluster node to select next and then forward the new service request to it. Some simple balancing methods can be used on their own, while others must be combined with other simple or advanced methods. No load-balancing algorithm is a cure-all: each generally reaches its full effectiveness only in particular application environments. So when evaluating a load-balancing algorithm, also pay attention to where the algorithm itself is applicable, and when deploying a cluster, weigh the cluster's own characteristics and combine different algorithms and techniques accordingly.
3.1 Rotation method
The rotation algorithm is the simplest and easiest to implement of all scheduling algorithms.
In a task queue, every member (node) has the same status. The rotation method simply selects the members of this group in turn.
In a load-balanced environment, the balancer sends each new request to the next node in the node queue and keeps cycling through the queue, so every cluster node is selected on an equal footing. This algorithm is widely used in DNS domain name polling.
Rotation is predictable: each node has a 1/n chance of being chosen, so the load distribution across the nodes is easy to calculate.
The rotation method is best suited to clusters in which all nodes have the same processing capacity and performance. In practice it is generally more effective when used together with other simple methods.
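The rotation described above can be sketched in a few lines of Python. This is only an illustration: the node names and the `pick_node` helper are made up for the example and do not come from any particular balancer.

```python
from collections import Counter
from itertools import cycle

# Hypothetical node names, for illustration only.
nodes = ["node-a", "node-b", "node-c"]

# cycle() walks the node queue in order and wraps around at the end.
rotation = cycle(nodes)

def pick_node():
    """Return the next node in the rotation."""
    return next(rotation)

# Dispatch 9 requests and count how many each node receives.
assignments = [pick_node() for _ in range(9)]
load = Counter(assignments)
print(load)
```

With 9 requests and 3 nodes, each node receives exactly 3 requests, which matches the 1/n share mentioned above.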
3.2 Hash method
The hash method uses a one-way, non-reversible hash function to send network requests to cluster nodes according to a fixed rule. Hashing shows its particular strength when several other balancing algorithms are not very effective.
For example, in the UDP session case mentioned earlier, the rotation method and several other algorithms that rely on connection information cannot recognize the beginning and end of a session, which can cause confusion in the application.
A hash mapping based on the packet's source address solves the problem to a certain extent: packets with the same source address are sent to the same server node, so transactions based on a higher-level session can run properly.
In contrast, a hash scheduling algorithm based on the destination address can be used in a Web cache cluster: the load balancer sends access requests for the same target website to the same cache service node, which avoids the cache-update problem caused by page misses.
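A minimal sketch of hashing on the source address follows. The text does not name a hash function, so SHA-256 is an assumption here, and the server names and `pick_by_source` are hypothetical.

```python
import hashlib

# Hypothetical server names, for illustration only.
servers = ["server-0", "server-1", "server-2"]

def pick_by_source(source_ip: str) -> str:
    """Map a source address to a server with a one-way hash."""
    digest = hashlib.sha256(source_ip.encode("ascii")).digest()
    # Use the first 4 bytes of the digest as an integer, then reduce it
    # to an index into the server list.
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]

# Every packet from the same source lands on the same server.
print(pick_by_source("10.0.0.7") == pick_by_source("10.0.0.7"))
```

Hashing on the destination address instead works the same way, with the destination IP passed in as the key.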
3.3 Least-connection method
In the least-connection method, the balancer records all currently active connections and sends the next new request to the node that currently has the fewest connections. The algorithm is intended for TCP connections. However, the system resources consumed can vary greatly from one application to another, so the number of connections does not necessarily reflect the real application load. When heavyweight Web servers (such as Apache servers) run as the cluster node service, the algorithm balances load less effectively. To reduce this adverse effect, a maximum number of connections can be set for each node (expressed as a threshold setting).
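The following sketch shows one way the connection count and the per-node threshold could fit together. The class and method names are hypothetical, and a single shared threshold is assumed for simplicity.

```python
class LeastConnectionBalancer:
    """Track active connections per node and pick the least-loaded one."""

    def __init__(self, nodes, max_connections):
        # max_connections is the per-node threshold (assumed identical
        # for every node in this sketch).
        self.max_connections = max_connections
        self.active = {node: 0 for node in nodes}

    def acquire(self):
        """Open a connection on the node with the fewest active ones.

        Returns None when every node has reached the threshold.
        """
        candidates = [n for n, count in self.active.items()
                      if count < self.max_connections]
        if not candidates:
            return None
        node = min(candidates, key=lambda n: self.active[n])
        self.active[node] += 1
        return node

    def release(self, node):
        """Close one connection on the given node."""
        if self.active[node] > 0:
            self.active[node] -= 1


balancer = LeastConnectionBalancer(["node-a", "node-b"], max_connections=2)
first = balancer.acquire()    # both idle, so the first node is chosen
second = balancer.acquire()   # node-b now has fewer connections
balancer.release(first)
third = balancer.acquire()    # node-a is the least loaded again
print(first, second, third)
```

Note that the balancer has to be told when a connection ends, which is why the method fits connection-oriented traffic such as TCP.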
3.4 Minimum missing method
In the minimum missing method, the balancer keeps a long-term record of the requests sent to each node and sends the next request to the node that has received the fewest requests in its history. Unlike the least-connection method, the minimum missing method records the number of past connections, not the number of current connections.
3.5 Fastest response method
The balancer records the network response time from itself to each cluster node and assigns the next incoming connection request to the node with the shortest response time. This requires actively probing the nodes with ICMP packets or with specialized UDP-based techniques.
In most LAN-based clusters the fastest response algorithm does not work very well, because ICMP packets on a LAN are basically all answered within 10 ms, which reveals no difference between the nodes. When balancing across a WAN, however, response time is still of practical significance for letting users reach the nearest server, and the more dispersed the cluster topology, the more effective the method.
This method is the main method used in advanced balancing based on topology redirection.
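As a rough illustration of the selection step only (the probing itself is not shown), the sketch below picks the node with the shortest recorded response time. All node names and timings are invented for the example.

```python
def pick_fastest(response_times):
    """Return the node with the shortest recorded response time."""
    return min(response_times, key=response_times.get)

def spread(response_times):
    """Difference between the slowest and fastest recorded times."""
    return max(response_times.values()) - min(response_times.values())

# Invented response times in seconds, for illustration only.
wan_times = {"node-near": 0.015, "node-mid": 0.070, "node-far": 0.180}
lan_times = {"node-a": 0.0004, "node-b": 0.0005, "node-c": 0.0004}

print(pick_fastest(wan_times))   # the nearest node stands out on a WAN
print(spread(lan_times))         # on a LAN the times barely differ
```

In the LAN case the spread is far below 10 ms, so the "fastest" node is chosen almost arbitrarily, which is the weakness described above.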
3.6 Weighting method
The weighting method can only be used in combination with other methods and is a very good complement to them. The weighted algorithm builds multiple priority queues for load balancing based on node priority or current load state (that is, the weights). Every waiting connection within a queue has the same processing level, so within a single queue the load can be balanced with the rotation or least-connection method described earlier, while the queues themselves are served in priority order.
Here the weights are based on an estimate of the capabilities of each node.
A more detailed description found on the web:
Load-balancing algorithms
We know that the load balancer plays a crucial role among load-balancing devices, acting as the link between the two sides. On the one hand it receives users' network requests; on the other it forwards each request to a specific application server according to some algorithm, thereby balancing the load. The algorithms in the load balancer are therefore critical. Most load-balancing devices implement the following algorithms.
1. Round-robin scheduling
The round-robin scheduling algorithm dispatches requests to different servers in turn: on each dispatch it computes i = (i + 1) mod n and selects the i-th server. The advantage of the algorithm is its simplicity. It does not need to record the state of all current connections, so it is a stateless scheduler.
In actual implementations, a weight is usually set for each server, which gives the weighted round-robin scheduling algorithm.
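A minimal sketch of the weighted variant follows. It assumes one simple way of applying weights, namely giving each server as many slots in the rotation as its weight, and then applies i = (i + 1) mod n over the slots. Real balancers may interleave the slots differently; the class name and server names are hypothetical.

```python
class WeightedRoundRobin:
    """Round robin over a slot list in which a server appears `weight` times."""

    def __init__(self, weights):
        # weights maps server name -> positive integer weight.
        self.schedule = [server
                         for server, weight in weights.items()
                         for _ in range(weight)]
        self.i = -1

    def next_server(self):
        # Same update rule as plain round robin, over the slots.
        self.i = (self.i + 1) % len(self.schedule)
        return self.schedule[self.i]


wrr = WeightedRoundRobin({"s1": 3, "s2": 1})
picks = [wrr.next_server() for _ in range(8)]
print(picks)
```

Over 8 requests, `s1` is chosen 6 times and `s2` twice, in proportion to the 3:1 weights. Only the index `i` is kept between dispatches; no connection state is recorded.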
2. Least-connection scheduling
The least-connection scheduling algorithm assigns each new connection request to the server with the fewest current connections.
Least-connection scheduling is a dynamic scheduling algorithm that estimates server load from the number of connections currently active on each server.
In actual implementations, a weight is usually set for each server, which gives weighted least-connection scheduling.
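The text does not say how the weight enters the decision. A common formulation, assumed in this sketch, is to pick the server with the smallest ratio of active connections to weight. The function name and the sample numbers are hypothetical.

```python
def pick_weighted_least_connection(active, weights):
    """Return the server with the lowest active-connections-to-weight ratio."""
    return min(active, key=lambda server: active[server] / weights[server])

# Invented numbers: "big" is assumed to be four times as capable as "small".
active = {"big": 6, "small": 3}
weights = {"big": 4, "small": 1}

plain_choice = min(active, key=active.get)
weighted_choice = pick_weighted_least_connection(active, weights)
print(plain_choice, weighted_choice)
```

Here the unweighted rule picks `small` because it has fewer connections, while the weighted rule picks `big` because 6/4 is lower than 3/1.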
3. Locality-based least connections (LBLC)
Locality-based least-connections scheduling (LBLC) balances load according to the destination IP address of the request packet. It is mainly used in cache cluster systems, because in a cache cluster the destination IP address of client request packets varies.
The LBLC scheduling algorithm first looks up the server most recently used for the destination IP address of the request. If that server is available and not overloaded, the request is sent to it. If no such server exists, or the server is overloaded and another server is at half of its workload, an available server is selected by the least-connection principle and the request is sent to that server.
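The LBLC decision can be sketched as below. The thresholds are assumptions made for the example: a server counts as overloaded once its active connections reach `max_connections`, and as half loaded when it holds fewer than half that number. The class and server names are hypothetical.

```python
class LBLC:
    """Sketch of locality-based least-connection scheduling."""

    def __init__(self, servers, max_connections):
        self.active = {s: 0 for s in servers}        # current connections
        self.available = {s: True for s in servers}
        self.max_connections = max_connections
        self.last_server = {}                        # destination IP -> server

    def _overloaded(self, server):
        return self.active[server] >= self.max_connections

    def _half_loaded_exists(self):
        return any(self.available[s]
                   and self.active[s] * 2 < self.max_connections
                   for s in self.active)

    def _least_connection(self):
        candidates = [s for s in self.active if self.available[s]]
        return min(candidates, key=self.active.get) if candidates else None

    def schedule(self, dest_ip):
        server = self.last_server.get(dest_ip)
        reselect = (
            server is None
            or not self.available[server]
            or (self._overloaded(server) and self._half_loaded_exists())
        )
        if reselect:
            server = self._least_connection()
            if server is None:
                return None
            self.last_server[dest_ip] = server
        self.active[server] += 1
        return server


lblc = LBLC(["cache-1", "cache-2"], max_connections=2)
results = [lblc.schedule("203.0.113.10") for _ in range(5)]
print(results)
```

Requests for the same destination stay on `cache-1` until it is overloaded while `cache-2` is still lightly loaded, at which point the mapping moves to `cache-2`. Once no lightly loaded server remains, the mapping stays where it is.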
4. Locality-based least connections with replication (LBLCR)
Locality-based least-connections with replication scheduling (LBLCR) also balances load according to the destination IP address and is currently used mainly in cache cluster systems. It differs from the LBLC algorithm in that it maintains a mapping from a destination IP address to a set of servers, whereas the LBLC algorithm maintains a mapping from a destination IP address to a single server.
The LBLCR scheduling algorithm maps a "hot" site to a set of cache servers (a server set). When the request load for the "hot" site grows, cache servers are added to the set to handle the increasing load; when the request load for the "hot" site falls, the number of cache servers in the set is reduced. In this way, copies of the "hot" site are unlikely to end up on every cache server, which improves the efficiency of the cache cluster system.
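A rough sketch of the growing and shrinking server set follows. The text does not say exactly when servers join or leave the set, so two assumptions are made: a server is added when the least-loaded member of the set has reached `max_connections`, and `shrink` (to be called by some hypothetical monitor when load falls) drops the busiest member.

```python
class LBLCR:
    """Sketch of locality-based least connections with replication."""

    def __init__(self, servers, max_connections):
        self.active = {s: 0 for s in servers}   # current connections
        self.max_connections = max_connections
        self.server_sets = {}                   # destination IP -> list of servers

    def _least(self, servers):
        return min(servers, key=self.active.get)

    def schedule(self, dest_ip):
        server_set = self.server_sets.setdefault(dest_ip, [])
        server = self._least(server_set) if server_set else None
        if server is None or self.active[server] >= self.max_connections:
            # The set is empty or saturated: bring in the least-loaded
            # server of the whole cluster.
            server = self._least(self.active)
            if server not in server_set:
                server_set.append(server)
        self.active[server] += 1
        return server

    def shrink(self, dest_ip):
        """Drop the busiest server from the set, keeping at least one."""
        server_set = self.server_sets.get(dest_ip, [])
        if len(server_set) > 1:
            server_set.remove(max(server_set, key=self.active.get))


lblcr = LBLCR(["c1", "c2", "c3"], max_connections=2)
picks = [lblcr.schedule("hot-site") for _ in range(3)]
print(picks, lblcr.server_sets["hot-site"])
```

After three requests the set for the hot destination has grown from `c1` to `c1` and `c2`, while `c3` still holds no copy of the site.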
5. Destination hashing scheduling
The destination hashing scheduling algorithm also balances load according to the destination IP address, but it is a static mapping algorithm that maps a destination IP address to a server through a hash function.
The algorithm uses the destination IP address of the request as the hash key and looks up the corresponding server in a statically allocated hash table. If that server is available and not overloaded, the request is sent to it; otherwise the algorithm returns NULL.
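The lookup can be sketched as follows. The hash function (the integer value of the address modulo the table size), the table size of 8, and the server records are all assumptions chosen to keep the example small; `None` stands in for NULL.

```python
import ipaddress

TABLE_SIZE = 8

# Hypothetical server records, for illustration only.
servers = [
    {"name": "cache-0", "available": True, "overloaded": False},
    {"name": "cache-1", "available": True, "overloaded": False},
]

# Statically allocated table: bucket i always points at the same server.
hash_table = [servers[i % len(servers)] for i in range(TABLE_SIZE)]

def hash_key(dest_ip):
    """Reduce the destination IP address to a bucket index."""
    return int(ipaddress.ip_address(dest_ip)) % TABLE_SIZE

def schedule(dest_ip):
    """Return the mapped server's name, or None if it cannot take the request."""
    server = hash_table[hash_key(dest_ip)]
    if server["available"] and not server["overloaded"]:
        return server["name"]
    return None

print(schedule("192.0.2.10"), schedule("192.0.2.11"))
```

Because the mapping is static, an overloaded or unavailable server is not replaced by another one: the lookup simply yields `None` and the caller has to decide what to do with the request.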
6. Source hashing scheduling
This algorithm is similar to destination hashing scheduling. The only difference is that the hash key is the source IP address of the request rather than its destination address.
In practice, source hashing and destination hashing scheduling can be used together in a firewall cluster, where they ensure a single entry point for the whole system.
Basic (commonly used) load-balancing algorithms