Article Title: Study on Multi-NIC Load Balancing on Linux Servers
1 Introduction
Today almost every industry runs its own servers. Because of the central role servers play, their reliability, availability, and I/O speed are critical, and maintaining high availability and security of servers is an important indicator of an enterprise IT environment. Most important of all is the high availability of the server's network connections. To meet these requirements, most servers today use multi-NIC configurations, and most of them run the popular Linux operating system. Bandwidth is no longer the bottleneck for improving service quality; the processing capability of network devices and servers is gradually becoming the new bottleneck. To improve the availability and reliability of server network connections, Sun's Trunking, 3Com's DynamicAccess, and Cisco's EtherChannel all implement link aggregation, which binds multiple NIC interfaces of a server together. Link aggregation virtualizes multiple links into one logical link, providing a cheap and effective way to expand the bandwidth of network devices and servers and to improve network flexibility and availability.
This article introduces the bonding technology in Linux, which is available in the Linux 2.4.x kernel. With bonding, multiple NIC interfaces can be bound into one virtual NIC; from the user's point of view, the aggregated device looks like a single Ethernet interface. In general, the bound NICs share the same IP address and are connected in parallel, aggregated into one logical link. Linux bonding provides several algorithms for load balancing. This article analyzes these algorithms and discusses their shortcomings, then proposes an improved load balancing method based on the transport protocol. It also discusses how to implement load balancing and failover across multiple network interfaces.
2 Introduction to Load Balancing Technology and High Availability Technology
2.1 Load Balancing Technology
The main idea of load balancing is to distribute network service traffic evenly across different servers and network devices according to an algorithm, relieving the burden on any single server or device and thereby improving the efficiency of the whole system. Load balancing can be implemented either in hardware with a load balancing function or in dedicated software. As a policy, load balancing lets multiple servers or links share heavy computing or I/O tasks, eliminating network bottlenecks at low cost and improving network flexibility and reliability.
2.2 High Availability Technology
Load balancing was first proposed on the basis of network high availability. High availability technology is a branch of fault tolerance technology, and redundancy is the simplest way to achieve high availability of a system. Complete network load balancing and high availability consists of two aspects: load balancing across multiple bound servers, and load balancing across multiple NICs bound on one server. This article mainly discusses the latter.
3 Simple Implementation of Load Balancing in Linux Bonding
3.1 Linux Bonding Technology
Linux bonding is a virtual layer implemented above the NIC driver and below the data link layer. With it, the multiple NICs that connect the server to the switch are bound not only to one IP address but also to one MAC address, forming a single virtual NIC. When a workstation requests data from the server and a NIC on the server receives the request, the bonding layer decides, according to a given algorithm, which NIC handles the transmission. Bonding improves both the network throughput and the availability of the host.
3.2 Several Sending Balancing Algorithms in Linux
At present, Linux bonding has three main sending algorithms: the round-robin algorithm, the active-backup algorithm, and the MAC address XOR algorithm (MAC-XOR). Each is analyzed briefly below.
3.2.1 Round-Robin (Rotation) Algorithm
This algorithm is based on fairness: it selects the sending interface packet by packet. The first packet is sent on one interface, the next packet on the next interface, and so on cyclically. The algorithm is simple and fair, it balances the sending load across the NICs, and its resource utilization is very high. However, when the packets of one connection or session are sent from different interfaces and travel over different intermediate links, they may arrive at the client out of order. Out-of-order packets need to be retransmitted, which lowers network throughput.
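The rotation described above can be sketched in a few lines of Python. This is only an illustration of the selection rule, not the kernel driver's code; the class name `RoundRobinSelector` and the use of integer interface indices are assumptions made for the example.

```python
class RoundRobinSelector:
    """Pick the sending interface for each packet in strict rotation."""

    def __init__(self, slave_cnt):
        self.slave_cnt = slave_cnt  # number of bound interfaces
        self.next_index = 0         # interface that sends the next packet

    def select(self):
        index = self.next_index
        # Advance cyclically so that every interface gets an equal share.
        self.next_index = (self.next_index + 1) % self.slave_cnt
        return index


selector = RoundRobinSelector(slave_cnt=2)
# Five packets of the same connection alternate between interfaces 0 and 1,
# which is why they can reach the client out of order.
assignments = [selector.select() for _ in range(5)]
print(assignments)  # [0, 1, 0, 1, 0]
```

Note that the selector never looks at the packet itself, which is what makes it fair but also what allows one connection to be spread over several links.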
3.2.2 Active-Backup Algorithm
This algorithm sets one of the NIC interfaces as active and the others as standby. When the active interface or its link fails, a standby link takes over. The algorithm provides high availability of the network connection, but its resource utilization is low because only one interface is working at any time: with N network interfaces, the utilization is 1/N.
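A minimal sketch of this failover behaviour follows, assuming link state is reported to the bond by some external monitor. The class `ActiveBackupBond`, its method names, and the interface names `eth0`/`eth1` are hypothetical and chosen only for illustration.

```python
class ActiveBackupBond:
    """One active interface carries all traffic; the rest are standby."""

    def __init__(self, interfaces):
        self.interfaces = list(interfaces)
        self.link_up = {name: True for name in self.interfaces}
        self.active = self.interfaces[0]

    def report_link(self, name, up):
        """Record a link state change (assumed to come from a link monitor)."""
        self.link_up[name] = up
        if name == self.active and not up:
            # The active link failed: promote the first healthy standby.
            self.active = next(
                (n for n in self.interfaces if self.link_up[n]), None
            )
        elif self.active is None and up:
            # Every link was down and one has just recovered.
            self.active = name

    def select(self):
        """Every packet is sent on the current active interface."""
        return self.active

    def utilization(self):
        """Only one of N interfaces works at a time, hence 1/N."""
        return 1 / len(self.interfaces)


bond = ActiveBackupBond(["eth0", "eth1"])
print(bond.select())          # eth0
bond.report_link("eth0", False)
print(bond.select())          # eth1
print(bond.utilization())     # 0.5
```

In this sketch a recovered interface does not take the active role back; it simply becomes a standby again. Whether to fail back is a design choice the text does not specify.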
3.2.3 MAC Address XOR Algorithm
The main idea of this algorithm is that the MAC address of the server and the MAC address of the client jointly determine the sending interface of each packet: the source MAC address is XORed with the destination MAC address, and the result is taken modulo the number of interfaces. Because all data sent to the same client passes through the same link, packets arrive at the client in order. However, the load is not balanced when only one client accesses the server, or when the server and the clients are not in the same subnet: in the latter case frames are addressed to the gateway's MAC address rather than the client's, so the XOR result is always the same. In these cases only one interface is used, and resource utilization is again 1/N (N is the number of interfaces).
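The XOR-and-remainder rule can be illustrated as follows. The text does not say which bytes of the MAC addresses enter the XOR; this sketch assumes only the last byte of each address is used, and all MAC addresses shown are made-up examples.

```python
def parse_mac(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into six raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def mac_xor_interface(src_mac, dst_mac, slave_cnt):
    """XOR the last byte of both MACs (an assumption), then take the remainder."""
    return (parse_mac(src_mac)[5] ^ parse_mac(dst_mac)[5]) % slave_cnt


server = "00:11:22:33:44:55"    # hypothetical server MAC
client_a = "00:aa:bb:cc:dd:01"  # hypothetical client in the same subnet
client_b = "00:aa:bb:cc:dd:02"  # another client in the same subnet
gateway = "00:aa:bb:cc:dd:fe"   # hypothetical gateway for remote clients

print(mac_xor_interface(server, client_a, 2))  # 0
print(mac_xor_interface(server, client_b, 2))  # 1
# Every client behind the gateway shares the gateway's MAC as destination,
# so all of that traffic lands on one interface.
print(mac_xor_interface(server, gateway, 2))   # 1
```

The result depends only on the two MAC addresses, so a given client is always served by the same interface, which keeps its packets in order.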
A large LAN usually has multiple subnets. The topology is as follows:
[Figure: LAN topology]
4 Transport-Protocol-Based Sending Algorithm
The sections above analyzed several load balancing algorithms for multiple NICs in Linux. To address their shortcomings, we propose another load balancing algorithm.
4.1 Algorithm Description and Implementation
There are two main transport protocols: TCP and UDP. UDP is a connectionless, unreliable transport protocol. TCP provides a connection-oriented, reliable byte-stream service: a client and a server must establish a connection before exchanging data. A TCP connection or UDP session is identified roughly by the following structure:
{Source, dst, saddr, daddr}
Source is the source port number, dst is the destination port number, saddr is the source IP address, and daddr is the destination IP address.
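As a small illustration, the four fields can be modelled as a record type. The field names follow the structure above; the type name `Session` and the sample values are assumptions made for this example only.

```python
from typing import NamedTuple


class Session(NamedTuple):
    """Identifies one TCP connection or UDP session."""
    source: int  # source port number
    dst: int     # destination port number
    saddr: str   # source IP address
    daddr: str   # destination IP address


# Illustrative values only; the addresses and ports are made up.
session = Session(source=40000, dst=80,
                  saddr="172.19.11.131", daddr="172.19.11.130")
print(session.daddr, session.dst)
```

Only `daddr` and `dst` matter for choosing the sending interface in the algorithm that follows, since both describe the destination of the packet.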
The main idea of the transport-protocol-based sending algorithm is that the sending interface of a packet is jointly determined by the destination host number, the subnet number of the destination host, and the TCP or UDP destination port of the session. Like the MAC address XOR algorithm, it is based on an exclusive-or computation.
The following notation is used:
(1) host is the destination host number of the data packet to be sent.
(2) subnet is the subnet number of the destination host.
(3) port is the destination port number of the UDP or TCP connection.
(4) slave_cnt indicates the number of bound interfaces.
The interface number is obtained by XORing these values and taking the remainder modulo the number of interfaces. For example, when slave_cnt is 4:
((host ^ subnet ^ port) & 0x03) % slave_cnt    Formula ①
The formula returns 0, 1, 2, or 3, so the algorithm can bind up to four NIC interfaces.
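Formula ① can be sketched in Python as shown here. The text does not define how the host number and subnet number are obtained; this example assumes IPv4 and takes them as the host part and the network part of the destination address under a given prefix length. The function names are hypothetical.

```python
import ipaddress


def split_address(daddr, prefix_len):
    """Split an IPv4 destination address into (host number, subnet number).

    Assumption: host number = bits outside the netmask,
    subnet number = network bits shifted down to an integer.
    """
    iface = ipaddress.ip_interface(f"{daddr}/{prefix_len}")
    host_bits = 32 - prefix_len
    host = int(iface.ip) & int(iface.network.hostmask)
    subnet = int(iface.network.network_address) >> host_bits
    return host, subnet


def select_interface(daddr, prefix_len, port, slave_cnt):
    """Formula 1: XOR host, subnet and port, keep two bits, take the remainder."""
    host, subnet = split_address(daddr, prefix_len)
    return ((host ^ subnet ^ port) & 0x03) % slave_cnt


# Destination 172.19.11.130/24, destination port 80, four bound interfaces.
print(split_address("172.19.11.130", 24))             # (130, 11277067)
print(select_interface("172.19.11.130", 24, 80, 4))   # 1
```

The mask 0x03 keeps only two bits, which is why the result never exceeds 3 regardless of how many interfaces are bound.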
This algorithm sends the packets of different connections from different interfaces as far as possible. The following cases are discussed:
(1) For two TCP connections of the same client, the sending interface depends only on the destination port of the connection. Let the destination ports of connection 1 and connection 2 be port1 and port2. When the two least significant bits of port1 and port2 (in binary) differ, formula ① gives different results and the two data streams are sent from different interfaces.
(2) For two TCP connections of different clients in the same subnet, the subnet number is the same; assume the destination port is also the same, but the host numbers differ. Let the destination host numbers be host1 and host2. When the two least significant bits of host1 and host2 (in binary) differ, formula ① gives different results and the two data streams are sent from different interfaces.
(3) For two TCP connections to clients in different subnets, assume the host number and the destination port are the same but the subnet numbers differ. Let the subnet numbers of connection 1 and connection 2 be subnet1 and subnet2. When the two least significant bits of subnet1 and subnet2 differ, formula ① gives different results and the two data streams are sent from different interfaces.
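The three cases can be checked numerically with a short sketch. The host, subnet, and port values below are arbitrary numbers picked for illustration, and slave_cnt is fixed at 4 as in the text.

```python
def select_interface(host, subnet, port, slave_cnt=4):
    """Formula 1 applied directly to host number, subnet number and port."""
    return ((host ^ subnet ^ port) & 0x03) % slave_cnt


# Case 1: same client, ports whose two low bits differ (5000 vs 5001).
case1 = (select_interface(130, 11, 5000), select_interface(130, 11, 5001))
# Case 2: same subnet and port, hosts whose two low bits differ (130 vs 131).
case2 = (select_interface(130, 11, 5000), select_interface(131, 11, 5000))
# Case 3: same host and port, subnets whose two low bits differ (11 vs 12).
case3 = (select_interface(130, 11, 5000), select_interface(130, 12, 5000))
# Counter-example: ports 5000 and 5004 share their two low bits,
# so both connections end up on the same interface.
same = (select_interface(130, 11, 5000), select_interface(130, 11, 5004))

print(case1)  # (1, 0)
print(case2)  # (1, 0)
print(case3)  # (1, 2)
print(same)   # (1, 1)
```

The counter-example shows the limit of the scheme: only the two least significant bits of each value influence the choice, so connections that agree in those bits are not separated.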
When there are N NIC interfaces, let $n_i$ ($i = 1, 2, \ldots, N$) be the number of connections sent from interface $i$ in a given period of time, and let $f_{ij}$ be the traffic of the $j$-th connection on interface $i$. The load of interface $i$ is then:
$L_i = \sum_{j=1}^{n_i} f_{ij}$
$L_1 = L_2 = \cdots = L_N$    Formula ②
When formula ② holds, the load of the interfaces is perfectly balanced. Because formula ① distributes connections over different interfaces as far as possible, $n_1, n_2, \ldots, n_N$ are generally equal or nearly equal; that is, each interface carries about the same number of connections. The traffic of each connection is not necessarily equal, however, so formula ② does not necessarily hold. Statistically, when the number of connections between the clients and the server is large enough and the period of time is long enough, formula ② can be expected to hold.
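The statistical argument can be explored with a small simulation. It computes the load of each interface as the sum of the traffic of its connections and reports how far the loads are from being equal. The uniform traffic distribution, the seed, and the `imbalance` measure are all assumptions of this sketch, not part of the original analysis.

```python
import random


def interface_loads(connections, slave_cnt):
    """Sum the traffic of every connection assigned to each interface."""
    loads = [0.0] * slave_cnt
    for iface, traffic in connections:
        loads[iface] += traffic
    return loads


def imbalance(loads):
    """Spread between busiest and idlest interface, relative to the mean load."""
    mean = sum(loads) / len(loads)
    return (max(loads) - min(loads)) / mean if mean else 0.0


rng = random.Random(0)  # fixed seed so that runs are repeatable
for total in (8, 80_000):
    # Equal connection counts per interface, but random traffic per connection.
    conns = [(j % 4, rng.uniform(1.0, 100.0)) for j in range(total)]
    print(total, round(imbalance(interface_loads(conns, 4)), 4))
```

With only a few connections the per-connection differences dominate; with many connections they average out and the relative spread becomes small, which is the sense in which the balance condition holds statistically.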
The analysis shows that the transport-protocol-based load balancing algorithm not only balances the load of the interfaces at the network and transport layers, but also ensures that data arrives at the client in order, while keeping resource utilization high.
5 Test Verification and Analysis of Results
Software environment: Red Hat Linux 9.0 (kernel 2.4.20)
Hardware environment: one server (CPU: Pentium 4 2.8 GHz; memory: 512 MB; two 10-Gigabit NICs that support the MII status word register, each with one interface); one client (same configuration as the server); two 24-port Gigabit switches (one is also sufficient). The test software is NetPIPE, which can measure TCP performance. It is used in turn to measure the network latency and throughput of the round-robin (rotation) algorithm, the MAC address XOR algorithm, and the transport-protocol-based sending algorithm. The server sends data and the client receives it.
The server sender executes:
NPtcp -t -s -h 172.19.11.130 -o test.ppt -P
The client receiver executes:
NPtcp -r -s
Table 1 shows the test results (averages over the test runs).
The test results show that the round-robin algorithm is simple, requires little computation, and has relatively low network latency. The transport-protocol-based sending algorithm requires more computation and has higher network latency. Because this is a two-machine test with only one client and one server, the MAC address XOR algorithm gives the same result every time, so only one interface is used and its throughput is the lowest. By comparison, the transport-protocol-based sending algorithm achieves higher throughput.
Table 1 test results
In this test, the server has only two NICs and only one client accesses it. As more NIC interfaces and clients are added, the advantages of the transport-protocol-based sending algorithm described in this article become more evident. When many clients in a large LAN establish connections with the server, the round-robin algorithm sends the packets of one connection over different links, so the probability that packets reach the client out of order increases, the number of retransmissions rises, and server throughput drops. The transport-protocol-based sending algorithm does not have this problem, so in that situation the server throughput increases.
6 Conclusion
Linux bonding binds multiple NIC interfaces and uses them together to send data; its algorithms implement load balancing and failover. It is an asymmetric load balancing technology: so far only the sending algorithm has been studied, and the receiving algorithm remains to be studied further. Link aggregation today binds NIC interfaces to improve the network performance of the server, but the implemented algorithms, including the transport-protocol-based sending algorithm, do not take interface speed into account, so further improvement is necessary.