1. Introduction
Essentially, network load balancing is an implementation of a distributed job scheduling system. As the controller of request allocation, the balancer uses a centralized or distributed policy to assign network service requests according to the current processing capacity of the cluster nodes, and it monitors the health of each node throughout the lifecycle of every service request. In general, a balancer should offer the following features for request scheduling:
Network Service requests must be manageable
Request allocation is transparent to users
Support for heterogeneous systems is desirable
Resources are allocated and adjusted dynamically according to the state of the cluster nodes
The load balancer distributes workload or network traffic among the service nodes of the cluster. It can be configured statically in advance, or it can decide which node receives a given load based on the current network status. Nodes may be interconnected within the cluster, but each must be directly or indirectly connected to the balancer.
A network load balancer can therefore be considered a job scheduling system at the network level. Most network load balancers implement a single system image at the corresponding network layer: the whole cluster is presented to users as a single IP address, and the specific node serving a request is transparent to them. The balancer can be configured statically or dynamically, and one or more algorithms determine which node gets the next network service request.
2. Network Balancing Principle
In the TCP/IP protocol suite, packets carry the necessary network information, so packet headers are very important to any concrete algorithm for network caching or load balancing. However, IP is packet-oriented while TCP is connection-oriented, and packets are often fragmented, so a single packet carries no complete application-level information, and in particular no state about the connection session. Packets must therefore be viewed from the connection perspective: as part of a connection from a source address and port to a destination address and port.
Another input to balancing is the resource usage of each node. Since balanced load is the ultimate goal of this type of system, grasping node load status in a timely and accurate manner, and dynamically adjusting task distribution according to the current resource usage of each node, is the other key issue for a dynamic network load balancing cluster.
Generally, service nodes in a cluster can report information such as processor load, application load, number of active users, available network protocol buffers, and other resources. This information is sent to the balancer through an efficient message mechanism; the balancer monitors the status of all processing nodes and decides which one receives the next task. The balancer can be a single device or a group of devices arranged in parallel or in a tree.
3. Basic Network Load Balancing Algorithms
The design of the balancing algorithm directly determines how well the cluster balances its load; a poorly designed algorithm leads to load imbalance. The main task of a balancing algorithm is to decide how to select the next cluster node and forward new service requests to it. Some simple methods can be used independently; others must be combined with additional simple or advanced methods. No balancing algorithm is omnipotent, and most work well only in particular application environments. When evaluating a load balancing algorithm, you should therefore also consider its applicability, take the characteristics of the cluster into account during deployment, and combine different algorithms and techniques as needed.
3.1 Rotation (Round Robin) Method
The rotation (round robin) algorithm is the simplest of all scheduling algorithms and the easiest to implement. All member nodes have equal status in a task queue, and the rotation method simply cycles through them in order. In a load balancing environment, the balancer sends each new request to the next node in the queue, and in this continuous, cyclical manner every cluster node is selected in turn on equal terms. This algorithm is widely used in DNS round robin.
The rotation method is predictable: each node is selected with probability 1/N, so the load distribution across nodes is easy to calculate. The method is best suited to clusters whose nodes all have the same processing capability and performance. In practice it is usually most effective when combined with other simple methods.
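As a minimal sketch (not tied to any particular balancer implementation), the rotation method can be expressed as a cyclic iterator over the node list; the node names here are illustrative:

```python
from itertools import cycle

class RoundRobinBalancer:
    """Select cluster nodes in fixed rotation; every node has equal status."""

    def __init__(self, nodes):
        self._ring = cycle(nodes)

    def next_node(self):
        # Each call returns the next node in the queue, wrapping around,
        # so over N requests every node is chosen exactly once (1/N share).
        return next(self._ring)

balancer = RoundRobinBalancer(["node-a", "node-b", "node-c"])
picks = [balancer.next_node() for _ in range(6)]  # two full rotations
```

With three nodes, six consecutive requests visit each node exactly twice, in order, which is what makes the distribution trivially predictable.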
3.2 Hash Method
In the hash method, network requests are mapped to cluster nodes by a one-way hash function according to fixed rules. Hashing is very useful where other balancing algorithms fall short. For example, in the case of UDP sessions mentioned above, the rotation method and other algorithms based on connection information cannot recognize the start and end of a session, which may confuse the application.
Hashing on the packet source address solves this problem to some extent: packets with the same source address are sent to the same server node, which allows transactions based on higher-level sessions to run properly. A hash scheduling algorithm based on the destination address can be used in a Web cache cluster: the balancer sends all requests for the same target site to the same cache node, avoiding the cache update problem caused by missed pages.
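A sketch of source-address hashing, assuming a simple hash-then-modulo mapping over the node list (the helper name and node names are illustrative, not from the text):

```python
import hashlib

def hash_select(source_address: str, nodes: list) -> str:
    """Map a packet's source address to a node via a one-way hash.

    The same source address always lands on the same node, so
    session-level state stays on one server.
    """
    digest = hashlib.md5(source_address.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(nodes)
    return nodes[index]

nodes = ["cache-1", "cache-2", "cache-3"]
```

Destination-address hashing for a Web cache cluster is the same computation applied to the target site instead of the source address. Note that a plain modulo mapping reshuffles most sessions when the node list changes; consistent hashing is the usual refinement.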
3.3 Least Connection Method
In the least connection method, the balancer records all currently active connections and sends each new request to the node with the fewest. This algorithm is well suited to TCP connections. However, because different applications consume very different amounts of system resources, the connection count does not always reflect the actual load; when the nodes run a heavyweight Web server (such as Apache), the balancing effect of this algorithm is reduced. To limit this adverse effect, a maximum connection count (threshold) can be set for each node.
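A minimal sketch of least-connection selection with the per-node threshold the text mentions (class and parameter names are my own, for illustration):

```python
class LeastConnectionBalancer:
    """Track active connections per node and pick the least-loaded one."""

    def __init__(self, nodes, max_connections=1000):
        # max_connections is the per-node threshold from the text.
        self.active = {node: 0 for node in nodes}
        self.max_connections = max_connections

    def acquire(self):
        # Choose the node with the fewest active connections that is
        # still under its threshold; None means the cluster is saturated.
        candidates = [n for n, c in self.active.items() if c < self.max_connections]
        if not candidates:
            return None
        node = min(candidates, key=lambda n: self.active[n])
        self.active[node] += 1
        return node

    def release(self, node):
        # Called when a connection closes, so counts reflect live load.
        self.active[node] -= 1
```

The counter only tracks how many connections are open, not how expensive each one is, which is exactly the weakness the text describes for heavyweight servers.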
3.4 Least Request Method
In the least request method, the balancer keeps a long-term record of the requests sent to each node and forwards the next request to the node that has received the fewest so far. Unlike the least connection method, it counts past requests rather than currently open connections.
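The difference from least connection is just which counter is consulted: a lifetime total that is never decremented. A small illustrative sketch:

```python
class LeastRequestBalancer:
    """Send the next request to the node with the fewest requests in history."""

    def __init__(self, nodes):
        # Lifetime counters: incremented on dispatch, never decremented.
        self.totals = {node: 0 for node in nodes}

    def next_node(self):
        node = min(self.totals, key=self.totals.get)
        self.totals[node] += 1
        return node
```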
3.5 Fastest Response Method
The balancer records the network response time from itself to each cluster node and assigns the next connection request to the node with the shortest response time. This method requires actively probing each node, using ICMP packets or a dedicated UDP-based technique.
In most LAN-based clusters, the fastest response algorithm does not work well, because ICMP replies on a LAN generally arrive within 10 ms and therefore do not reflect differences between nodes. When balancing over a WAN, however, response time is genuinely useful for directing users to a nearby server, and the more geographically scattered the cluster topology, the more effective the method becomes. It is the main technique used in advanced, topology-based redirection.
3.6 Weighted Method
Weighting can only be used together with other methods, as a good supplement to them. Based on node priority or current load status (that is, the weight value), the weighted algorithm forms a multi-priority queue for load balancing. Connections waiting in the same queue have the same processing level and can be balanced within the queue by the rotation or least connection methods described above; the queues themselves are served in priority order. Here the weight is an estimate based on the capability of each node.
4. Dynamic Feedback Load Balancing
When customers access cluster resources, the time their tasks take and the computing resources they consume vary widely, depending on many factors: the type of service requested, the current network bandwidth, and the current server resource utilization. Heavy tasks may require compute-intensive queries, database access, or long response data streams; lighter requests may only need to read a small file or perform a simple calculation.
These differences in processing time can skew node utilization, that is, unbalance the load across processing nodes. Some nodes become overloaded while others sit largely idle; the busy nodes accumulate long request queues yet keep receiving new requests. This in turn makes customers wait longer and degrades the overall service quality of the cluster. A mechanism is therefore needed that lets the balancer learn the load status of each node in real time and adjust as the load changes.
In practice, a dynamic load balancing algorithm based on a negative feedback mechanism is adopted. The algorithm considers the real-time load and responsiveness of each node and continually adjusts the proportion of tasks distributed to each, so that overloaded nodes do not keep receiving large numbers of requests, improving the overall throughput of the cluster.
A monitoring process runs on the balancer, collecting load information from every node in the cluster, while a client process on each node periodically reports that node's load status. From the collected information the monitor computes a weight for each node, based on CPU utilization, available memory, and disk I/O status, and distributes pending tasks in proportion to these weights. If the difference between a new weight and the current weight exceeds a set threshold, the monitor uses the new weight to redistribute tasks within the cluster until the next load synchronization. The balancer can then apply a weighted round robin algorithm over these dynamic weights to schedule incoming network service requests.
4.1 Weighted Round Robin Scheduling
The weighted round-robin scheduling (Weighted Round-Robin Scheduling) algorithm uses a weight to represent each node's processing capacity. Task requests are distributed to nodes in polling order according to their weights: nodes with higher weights receive more requests than nodes with lower weights, and nodes with equal weights receive equal shares. The basic principle of weighted round robin can be described as follows:
Suppose a cluster has a group of nodes N = {N0, N1, ..., Nn-1}, where W(Ni) denotes the weight of node Ni. An index variable i records the node selected last time, and T(Ni) denotes the number of tasks currently allocated to node Ni. Further,
Σ T(Ni) denotes the total number of tasks to be processed in the current synchronization cycle, and
Σ W(Ni) denotes the sum of the node weights. Then:
W(Ni) / Σ W(Ni) = T(Ni) / Σ T(Ni)
That is, tasks are allocated to each node in proportion to its share of the total weight.
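The proportional property can be realized with the classic weighted round-robin sweep (in the style of the LVS `wrr` scheduler; this sketch is my own rendering, not code from the text): a current-weight threshold is lowered from the maximum weight in steps of the gcd of all weights, and on each pass every node whose weight meets the threshold receives one task.

```python
from functools import reduce
from math import gcd

def weighted_round_robin(nodes, weights, total_tasks):
    """Distribute total_tasks over nodes so node i gets ~W(Ni)/sum(W)."""
    step = reduce(gcd, weights)          # lower the threshold in gcd steps
    assignments = {n: 0 for n in nodes}
    schedule = []
    i, cw = -1, 0                        # i: last selected index, cw: threshold
    while len(schedule) < total_tasks:
        i = (i + 1) % len(nodes)
        if i == 0:                       # completed one pass over all nodes
            cw -= step
            if cw <= 0:
                cw = max(weights)        # restart from the heaviest weight
        if weights[i] >= cw:             # node qualifies at this threshold
            schedule.append(nodes[i])
            assignments[nodes[i]] += 1
    return schedule, assignments
```

Over one full cycle of Σ W(Ni) tasks, each node receives exactly W(Ni) of them, matching W(Ni)/Σ W(Ni) = T(Ni)/Σ T(Ni), with the higher-weight nodes also served first.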
4.2 Weight Calculation
When a cluster node is first brought into the system, the system administrator sets an initial weight for each node based on its hardware configuration (the better the hardware, the higher the default value), and this weight is also used on the balancer. The weight is then adjusted as the node's load changes.
Dynamic weights are calculated from various runtime parameters of the node. In the experiment we selected the most important ones, including CPU resources, memory resources, the current number of processes, and response time, as factors in the calculation formula, and the new weight is computed from the current weight of each node. The purpose of the dynamic weight is to reflect node load accurately enough to predict the node's likely load in the near future. The importance of each parameter differs across system applications: in a typical Web environment, available memory and response time matter most, while for long database transactions CPU usage and available memory are relatively more important. So that the proportions can be tuned per application while the system runs, we assign each parameter a constant coefficient Ri expressing its importance, with Σ Ri = 1. The weight formula for any node Ni can then be written as:
LOAD(Ni) = R1 * Lcpu(Ni) + R2 * Lmemory(Ni) + R3 * Lio(Ni) + R4 * Lprocess(Ni) + R5 * Lresponse(Ni)
where Lf(Ni) denotes the load value of parameter f on node Ni, and the five parameters in the formula are CPU utilization, memory utilization, disk I/O occupancy, total number of processes, and response time.
For example, in a Web server cluster we use the coefficients {0.1, 0.4, 0.1, 0.1, 0.3}, reflecting the view that the server's memory and request response time are more important than the other parameters. If the current coefficients Ri do not capture the application load well, the system administrator can keep correcting them until a set close to the current application is found.
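The LOAD(Ni) formula is a plain weighted sum, sketched below. Note one assumption: the original text lists only four coefficient values for five parameters, so the fifth (response time) is taken here as 0.3 to satisfy Σ Ri = 1 and the stated emphasis on memory and response time.

```python
def dynamic_load(node_metrics, coefficients):
    """Compute LOAD(Ni) = sum(Ri * Lf(Ni)) over the five load parameters.

    Both lists are ordered as in the text: CPU, memory, disk I/O,
    process count, response time. Coefficients Ri must sum to 1.
    """
    assert abs(sum(coefficients) - 1.0) < 1e-9, "coefficients must sum to 1"
    return sum(r * load for r, load in zip(coefficients, node_metrics))

# Web-oriented coefficients; the fifth value (0.3) is an assumption
# chosen so that the coefficients sum to 1.
web_coeffs = [0.1, 0.4, 0.1, 0.1, 0.3]
```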
In addition, although a short collection period reflects each node's load more accurately, collecting too frequently (for example, once or several times per second) burdens both the balancer and the nodes and adds unnecessary network load. Moreover, because the collector performs load computation at collection time, experiments have shown that the per-node load information seen by the balancer jitters severely, so the balancer cannot capture the real load trend of a node. To address this, the collection period should be tuned appropriately, generally to the range of 5 to 10 seconds; in addition, a moving average or sliding window can be applied so that the load information the balancer sees forms a smooth curve, which makes the negative feedback mechanism easier to tune.
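The sliding-window smoothing mentioned above amounts to averaging the last few samples; a minimal illustrative sketch:

```python
from collections import deque

class SmoothedLoad:
    """Moving average over the last `window` load samples to damp jitter."""

    def __init__(self, window=5):
        # deque(maxlen=...) silently drops the oldest sample when full.
        self.samples = deque(maxlen=window)

    def update(self, load_value):
        # Once the window is full, one noisy sample shifts the average
        # by only 1/window of its deviation, smoothing the curve the
        # balancer sees.
        self.samples.append(load_value)
        return sum(self.samples) / len(self.samples)
```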
The balancer's dynamic weight acquisition program runs cyclically. If the default weight is nonzero, it queries the node's load parameters and calculates the dynamic weight LOAD(Ni). The final weight is then computed from the node's initial weight and the collected dynamic weight using the following formula.
Wi = A * DW(Ni) + B * (LOAD(Ni) - DW(Ni)) * 1/3
In the formula, if the dynamic weight equals the initial weight, the final weight is unchanged and equal to the initial weight DW(Ni), meaning the system load is exactly in the ideal state. If the dynamic weight comes out higher than the initial weight, the final weight increases: the system load is light, and the balancer will raise the share of tasks allocated to the node. If the dynamic weight is lower than the initial weight, the final weight decreases: the system is starting to overload, and the balancer will reduce the tasks assigned to the node. In actual use, if the weights of all nodes fall below their DW(Ni), the cluster as a whole is overloaded and a new node should be added to take over part of the load; conversely, if the weights of all nodes are well above DW(Ni), the current system load is light.
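One plausible reading of the update formula (the original is partly garbled, so the parenthesization and the 1/3 damping factor here are a reconstruction consistent with the surrounding description) can be sketched as:

```python
def final_weight(initial_weight, dynamic_weight, a=1.0, b=1.0):
    """Negative-feedback weight update: Wi = A*DW + B*(LOAD - DW)/3.

    DW is the administrator-set initial weight, LOAD the measured
    dynamic weight. With A = 1, equal weights leave Wi unchanged;
    a higher dynamic weight (lighter load) raises Wi, a lower one
    shrinks the node's task share. The /3 term damps the correction.
    """
    return a * initial_weight + b * (dynamic_weight - initial_weight) / 3
```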
5. Summary
Network load balancing is a concrete implementation of a cluster job scheduling system. Because the job unit it handles is a network connection under the TCP/IP protocol, the basic scheduling algorithms can be applied centrally to network connections. To cope with possible load imbalance in the cluster, the weights of the service nodes are obtained dynamically, and a negative feedback mechanism adjusts how the balancer distributes network service requests, adapting to resource changes on the service nodes at runtime. Building on an LVS cluster system, I also improved the original round robin algorithm by adding a program that collects dynamic weights and feeds them back to the balancer's scheduling system in real time. Practice has shown that dynamic balancing improves the overall throughput of the cluster system, and the negative feedback mechanism is particularly effective when the nodes differ in performance and the network services provided by the cluster have diverse resource demands. Dynamic load balancing with negative feedback can also be applied well in other clusters, though the job units the balancer handles will differ from network connections and the specific load algorithms will differ accordingly.