Linux Load Balancing Summary (four-layer load balancing / seven-layer load balancing)

Source: Internet
Author: User
Tags snmp haproxy

Reprint: http://www.cnblogs.com/kevingrace/p/6137881.html

One, what is load balancing
1) Load balancing builds on the existing network structure. It provides an inexpensive and effective way to expand the bandwidth of network devices and servers, increase throughput, strengthen network data processing capacity, and improve the flexibility and availability of the network. Load balancing has two meanings. First, a large volume of concurrent access or data traffic is spread across multiple node devices, which reduces the time users wait for a response. Second, a single heavy task is split across multiple node devices for parallel processing; when each node finishes, the results are combined and returned to the user, which greatly increases the processing capacity of the system.
2) Simply put: one meaning is to forward a large number of concurrent requests to multiple back-end nodes to reduce response time; the other is to split a single heavy job across multiple back-end nodes, collect the results at the load balancer, and return them to the user. Most load balancing techniques in use today aim to improve the availability and scalability of Internet server programs, such as those on Web servers, FTP servers, and other mission-critical servers.

Two, load balancing classification
1) Two-layer load balancing (MAC)
In terms of the OSI model, two-layer load balancing is generally based on a virtual MAC address: external clients send requests to the virtual MAC address, and the load balancer, after receiving them, assigns a back-end server's actual MAC address to respond.
2) Three-layer load balancing (IP)
This generally uses a virtual IP address: external clients send requests to the virtual IP address, and the load balancer, after receiving them, assigns a back-end server's actual IP address to respond.
3) Four-layer load balancing (TCP)
Building on three-layer load balancing, the load balancer receives requests on an IP address plus port and forwards them to the corresponding machine.
4) Seven-layer load balancing (HTTP)
The load balancer receives requests based on application-layer information such as a virtual URL, IP address, or host name, and forwards them to the appropriate server.

Four-layer and seven-layer load balancing are the most common in operations work, so the rest of this section focuses on these two.
1) Four-layer load balancing is load balancing based on IP + port. Building on three-layer load balancing, it publishes a layer-three IP address (the VIP) and adds a layer-four port number to determine which traffic needs to be load balanced. That traffic is processed by NAT and forwarded to a back-end server. The load balancer records which server handles each TCP or UDP flow, and all subsequent traffic for that connection is forwarded to the same server.
The corresponding load balancer is called a four-layer switch (L4 switch). It mainly analyzes the IP layer and the TCP/UDP layer to achieve four-layer load balancing. This type of load balancer does not understand application protocols (such as HTTP, FTP, or MySQL).
Products that implement four-layer load balancing include:
F5: a hardware load balancer; powerful, but expensive.
LVS: heavyweight four-layer load balancing software.
Nginx: lightweight four-layer load balancing software with a cache function and flexible regular expressions.
HAProxy: simulates four-layer forwarding and is more flexible.
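
The four-layer behavior described in 1) can be sketched in a few lines of Python. This is only an illustration of the idea (match VIP + port, pick a server for a new connection, keep later packets of that connection on the same server); it is not how LVS or any other product is implemented. The class name `L4Balancer`, the addresses, and the simple rotation used to pick a server are all assumptions made for the example.

```python
from itertools import cycle

class L4Balancer:
    """Toy four-layer balancer: matches VIP + port, then pins each connection to one backend."""

    def __init__(self, vip, port, backends):
        self.vip = vip
        self.port = port
        self._next_backend = cycle(backends)   # assumed policy: simple rotation
        self.connections = {}                  # (client_ip, client_port) -> backend

    def forward(self, client_ip, client_port, dst_ip, dst_port):
        # Only traffic addressed to the published VIP and port is balanced.
        if (dst_ip, dst_port) != (self.vip, self.port):
            return None
        key = (client_ip, client_port)
        # The first packet of a connection picks a backend; later packets reuse it.
        if key not in self.connections:
            self.connections[key] = next(self._next_backend)
        return self.connections[key]

# Hypothetical addresses from the documentation ranges.
lb = L4Balancer("203.0.113.10", 80, ["10.0.0.1", "10.0.0.2"])
first = lb.forward("198.51.100.7", 40000, "203.0.113.10", 80)
again = lb.forward("198.51.100.7", 40000, "203.0.113.10", 80)
other = lb.forward("198.51.100.8", 40001, "203.0.113.10", 80)
```

Here `first` and `again` refer to the same backend because they belong to the same connection, while `other` is a new connection and gets the next backend. Note that nothing in `forward` looks at the payload, which is why this kind of balancer cannot tell HTTP from FTP.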
2) Seven-layer load balancing is load balancing based on application-layer information such as a virtual URL or host IP. It builds on four-layer load balancing (seven-layer load balancing is impossible without layer four) and additionally considers characteristics of the application layer. For example, when load balancing a Web server, besides using the VIP plus port 80 to identify the traffic to be processed, the load balancer can also decide how to balance based on the seven-layer URL, browser type, or language. Suppose your Web servers are divided into two groups, one for Chinese and one for English. A seven-layer load balancer can automatically identify the user's language when the user accesses your domain name and then select the corresponding server group for load balancing.
The corresponding load balancer is called a seven-layer switch (L7 switch). Besides supporting four-layer load balancing, it also analyzes application-layer information, such as the HTTP URI or cookies, to achieve seven-layer load balancing. This kind of load balancer understands application protocols.
Software that implements seven-layer load balancing includes:
HAProxy: built for load balancing; fully supports seven-layer proxying, session persistence, marking, and path-based forwarding.
Nginx: works well only for the HTTP and mail protocols; performance is similar to HAProxy.
Apache: limited functionality.
MySQL Proxy: adequate functionality.
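
The Chinese/English example in 2) can be sketched as follows. This is a minimal illustration that assumes the language is taken from the first tag of the HTTP `Accept-Language` header and ignores quality values; the function name `pick_group` and the server addresses are hypothetical, and a real seven-layer balancer may identify the language differently.

```python
def pick_group(headers, groups, default="en"):
    """Choose a server group from the first language tag in Accept-Language."""
    accept = headers.get("Accept-Language", "")
    # "zh-CN,zh;q=0.9,en;q=0.8" -> first tag "zh-cn" -> primary language "zh"
    first_tag = accept.split(",")[0].split(";")[0].strip().lower()
    primary = first_tag.split("-")[0]
    # Unknown or missing languages fall back to the default group.
    return groups.get(primary, groups[default])

# Hypothetical server groups, one per language.
groups = {"zh": ["10.0.1.1", "10.0.1.2"], "en": ["10.0.2.1", "10.0.2.2"]}
zh_group = pick_group({"Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8"}, groups)
en_group = pick_group({"Accept-Language": "en-US,en;q=0.5"}, groups)
fallback = pick_group({}, groups)
```

After the group is chosen, any ordinary balancing algorithm can pick one server inside it.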

Generally, LVS is used for four-layer load balancing and Nginx for seven-layer load balancing; HAProxy is more flexible and can do both.

Three, the difference between the two
1) Analysis from the technical principle
Four-layer load balancing determines the internal server mainly from the destination address and port in the packet, combined with the server selection method configured on the load balancing device.
Take ordinary TCP as an example. When the load balancing device receives the first SYN from the client, it selects the best server as described above, rewrites the destination IP address in the packet to the back-end server's IP, and forwards the packet directly to that server. The TCP connection, that is, the three-way handshake, is established directly between the client and the server; the load balancing device merely performs a router-like forwarding action. In some deployments, to make sure the server's reply packets return correctly to the load balancing device, the device may also rewrite the original source address of the packet while forwarding it.
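
The address rewriting in the paragraph above can be pictured with a small model. This is a sketch only: real devices rewrite packet headers in the kernel or in hardware, and the `Packet` type, the `forward_syn` function, and all addresses here are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    dst_port: int
    flags: str

def forward_syn(packet, backend_ip, balancer_ip=None):
    """Rewrite the destination to the chosen backend; optionally rewrite the source."""
    out = replace(packet, dst_ip=backend_ip)
    if balancer_ip is not None:
        # Source rewriting makes the backend send its replies to the balancer.
        out = replace(out, src_ip=balancer_ip)
    return out

# A client SYN addressed to the VIP (all addresses are made up).
syn = Packet(src_ip="198.51.100.7", dst_ip="203.0.113.10", dst_port=80, flags="SYN")
dnat_only = forward_syn(syn, "10.0.0.1")
with_snat = forward_syn(syn, "10.0.0.1", balancer_ip="10.0.0.254")
```

In `dnat_only` the backend still sees the real client address, so the handshake effectively runs between client and server. In `with_snat` the backend sees the balancer as the source, which guarantees that replies come back through the balancer.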

Seven-layer load balancing, also known as "content switching", determines the internal server mainly from the meaningful application-layer content of the message, combined with the server selection method configured on the load balancing device.
Again taking TCP as an example: before it can choose a server based on application-layer content, the load balancing device must first establish a connection with the client on the server's behalf (the three-way handshake). Only then can it receive the message carrying the real application-layer content from the client, and then choose the internal server according to specific fields in that message plus the configured server selection method. In this case the load balancing device behaves more like a proxy server: it establishes separate TCP connections with the front-end client and with the back-end server. From this point of view, seven-layer load balancing clearly places higher demands on the load balancing device, and its seven-layer processing capacity is bound to be lower than in a four-layer deployment.
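
To make concrete what "specific fields in the message" means, here is a minimal sketch of reading the head of an HTTP/1.x request. The bytes in `raw` are only available after the client's TCP connection has been accepted, which is exactly why the device has to act as a proxy. The function name `parse_request_head` and the sample request are assumptions for illustration; this is not a complete HTTP parser.

```python
def parse_request_head(raw: bytes):
    """Return (method, path, headers) from the head of an HTTP/1.x request."""
    # Everything before the first blank line is the request line plus headers.
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    method, path, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, headers

# Hypothetical bytes read from an already established client connection.
raw = b"GET /images/logo.png HTTP/1.1\r\nHost: www.example.com\r\nCookie: sid=abc\r\n\r\n"
method, path, headers = parse_request_head(raw)
```

With `path`, `headers["host"]`, and `headers["cookie"]` in hand, the balancer can apply its selection method and only then open the second TCP connection to the chosen back-end server.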

2) Analysis from the application scenario
The benefit of seven-layer load balancing is that it makes the entire network more "intelligent". You can refer to the article "HTTP application optimization and acceleration instructions-load balancing" for a basic understanding of the advantages of this approach. For example, for user traffic to a Web site, seven-layer processing can send image requests to dedicated image servers, where caching can be applied, and text requests to dedicated text servers, where compression can be applied. This is only a small example of seven-layer use. In principle, this approach can modify the client's request and the server's response in any way, which greatly increases the flexibility of the application system at the network layer. Many features usually deployed on the back end, for example in Nginx or Apache, can be moved forward to the load balancing device, such as rewriting headers in client requests, or filtering keywords in and inserting content into server responses.
Another frequently mentioned feature is security. Take the most common network attack, the SYN flood: attackers control many source clients and send SYN packets with forged IP addresses to the same target. Such an attack usually sends a large number of SYN packets to exhaust the relevant resources on the server and achieve denial of service (DoS). From the technical principle it follows that in four-layer mode these SYN packets are forwarded to the back-end servers, whereas in seven-layer mode they naturally terminate at the load balancing device and do not affect the normal operation of the back-end servers. In addition, the load balancing device can apply various policies at layer seven to filter specific messages, such as SQL injection and other application-level attacks, further improving overall system security at the application level.
Today seven-layer load balancing is mainly applied to the HTTP protocol, so it is mostly used for Web sites and internal information platforms, such as systems built on the B/S (browser/server) model. Four-layer load balancing corresponds to other TCP applications, such as ERP systems built on the C/S (client/server) model.

3) Issues to consider for seven-layer applications
1. Is it really necessary? Seven-layer applications can indeed make traffic handling more intelligent, but they inevitably bring complex device configuration, heavier pressure on the load balancer, and more complicated troubleshooting. When designing the system, you need to consider the mixed case where four-layer and seven-layer load balancing are used at the same time.
2. Does it really improve security? Take a SYN flood attack: seven-layer mode does keep this traffic away from the servers, but the load balancing device itself must then have strong anti-DDoS capability. Otherwise, even if the servers are healthy, a failure of the load balancing device that acts as the central dispatcher will bring down the entire application.
3. Is it flexible enough? The advantage of seven-layer applications is that the traffic of the entire application can be handled intelligently, but the load balancing device must provide complete seven-layer functionality to schedule traffic according to the customer's different application needs. The simplest test is whether it can replace the scheduling functions of a back-end server such as Nginx or Apache. Only a load balancing device that provides a seven-layer application development interface, allowing customers to define functions freely according to their needs, can truly offer great flexibility and intelligence.
4) Overall comparison
1. Intelligence
Seven-layer load balancing has access to all seven layers of the OSI model, so it handles user needs more flexibly. In theory, the seven-layer model can modify every request a user sends to the server, for example by adding information to a header or by classifying and forwarding requests according to file type. The four-layer model only supports forwarding based on network-level information (address and port) and cannot modify the content of the user's request.
2. Security
Because seven-layer load balancing works across all layers of the OSI model, it can more easily resist network attacks. The four-layer model, in principle, forwards the user's request directly to the back-end node and cannot directly defend against network attacks.
3. Complexity
The four-layer model generally has a simple architecture that is easy to manage and makes problems easy to locate. The seven-layer model has a more complex architecture, often has to be combined with the four-layer model, and makes problems harder to locate.
4. Efficiency
The four-layer model works at a lower layer, so it is usually more efficient, but its range of applications is limited. The seven-layer model consumes more resources but is theoretically more powerful than the four-layer model; current implementations are mostly based on HTTP applications.

Four, load balancing technical solutions
Many different load balancing techniques exist to meet different application requirements. They can be classified by the kind of device used (software/hardware load balancing), by the OSI network layer at which they operate (load balancing at the network level), and by the geographic structure of the application (local/global load balancing).
1) Software/hardware load balancing
A software load balancing solution installs one or more additional pieces of software on the operating system of one or more servers, for example DNS Load Balance, CheckPoint Firewall-1 ConnectControl, or Keepalived + IPVS. Its advantages are that it fits a specific environment, is simple to configure, flexible to use, and inexpensive, and it meets ordinary load balancing needs. Software solutions also have several disadvantages. The additional software on each server consumes an unpredictable amount of system resources; the more powerful the module, the more it consumes, so when connection requests are especially numerous, the software itself can become the deciding factor in whether the server keeps working. Software scalability is limited by the operating system, and bugs in the operating system itself often cause security problems.
A hardware load balancing solution installs a load balancing device directly between the servers and the external network. This device, which we call a load balancer, is typically hardware independent of the server operating system. Because dedicated equipment performs a dedicated task, independent of the operating system, overall performance improves considerably; combined with diverse load balancing strategies and intelligent traffic management, it can best meet load balancing requirements. Load balancers come in several forms. Besides standalone load balancers, some are integrated into switching devices and placed between the servers and the Internet link, and some integrate the function into a PC with two network adapters, one connected to the Internet and the other connected to the internal network of the back-end server farm.
Software load balancing vs. hardware load balancing:
The advantages of software load balancing are a clearly defined target environment, simple configuration, flexible operation, and low cost; its efficiency is modest, but it satisfies the needs of ordinary enterprises. The disadvantages are that it depends on the host system and adds resource overhead, that the quality of the software determines the performance of the environment, and that the security of the system and the stability of the software affect the security of the whole environment.
The advantages of hardware load balancing are that it is independent of the system, that overall performance is greatly improved and its functionality and performance exceed the software approach, and that intelligent traffic management with a choice of strategies achieves the best load balancing effect. The disadvantage is the high price.
2) Local/global load balancing
By the geographic structure of the application, load balancing is divided into local load balancing (Local Load Balance) and global load balancing (Global Load Balance, also called geographic load balancing). Local load balancing balances load across a local server farm; global load balancing balances load across server groups that are placed in different geographic locations and have different network structures.
Local load balancing effectively solves the problems of excessive data traffic and network overload without the expense of purchasing higher-performance servers. It makes full use of existing equipment and avoids the loss of data traffic caused by a single point of failure on a server. It offers flexible and varied balancing strategies to distribute data traffic sensibly among the servers in the farm. Even when existing servers need to be expanded or upgraded, you simply add a new server to the farm, without changing the existing network structure or stopping existing services.
Global load balancing is mainly used by sites that have their own servers in multiple regions, so that users worldwide can reach the server closest to them through a single IP address or domain name and get the fastest access. It can also be used by large companies with widely distributed subsidiaries and sites to allocate resources uniformly and sensibly over an intranet (the enterprise's internal network).
3) Load balancing at the network level
Depending on where the network is overloaded, the corresponding load balancing technique can be applied at different network layers to solve the problem.
As bandwidth and data traffic grow, the data interfaces in the core of the network become a bottleneck. The original single link can hardly meet demand, and upgrading the link is too expensive or even impractical. In this case, consider link aggregation (trunking).
Link aggregation (layer-two load balancing) uses multiple physical links as a single aggregated logical link. Network traffic is shared by all physical links in the aggregated logical link, which logically increases the capacity of the link to meet the demand for more bandwidth.
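
One way traffic can be shared across the member links is to hash each flow onto a link, so that frames of the same flow always take the same path. The text above does not say which distribution method is used, so the sketch below is an assumption for illustration: the function name `pick_link`, the use of source and destination MAC addresses as the flow key, and the CRC32 hash are all hypothetical choices.

```python
import zlib

def pick_link(src_mac, dst_mac, links):
    """Map a flow to one member link so frames of one flow stay on one link."""
    key = f"{src_mac}-{dst_mac}".encode()
    # A stable hash keeps the same flow on the same physical link every time.
    return links[zlib.crc32(key) % len(links)]

# Hypothetical member links of one aggregated logical link.
links = ["eth0", "eth1", "eth2"]
chosen = pick_link("00:11:22:33:44:55", "66:77:88:99:aa:bb", links)
```

With many different flows, the hash spreads them over all member links; a single flow never exceeds the capacity of one physical link under this scheme.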
Modern load balancing techniques typically operate at the fourth or seventh layer of the network. Layer-four load balancing maps one legally registered Internet IP address to the IP addresses of multiple internal servers and dynamically uses one of the internal IP addresses for each TCP connection request. This technique is widely used in layer-four switches: when a connection request packet whose destination address is the server group's VIP (virtual IP address) passes through the switch, the switch maps between the server IP and the VIP based on the source and destination IP addresses, the TCP or UDP port number, and a load balancing policy, and selects the best server in the server farm to handle the connection request.

Seven-layer load balancing controls the content of application-layer services and provides high-level control of access traffic, which suits HTTP server farms. It inspects the HTTP headers flowing through and performs load balancing based on the information in them.
The advantages of seven-layer load balancing are as follows:
1) By checking the HTTP headers, it can detect HTTP 4xx and 5xx series error responses and transparently redirect the connection request to another server, avoiding application-layer failures.
2) Based on the type of data in the traffic (for example, whether it is an image file, a compressed file, or a multimedia format), it can direct the traffic to the server that holds the corresponding content, increasing system performance.
3) Based on the type of connection request, such as requests for static documents (plain text, images) or for dynamic documents (ASP, CGI), it can direct the request to the corresponding server, improving system performance and security.
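
Advantages 1) and 3) can be combined in a short sketch: choose a pool by request type, then retry on another server when one answers with an error status. Several things here are assumptions for illustration: the function names, the list of static file extensions, the `send` callback standing in for a real HTTP round trip, and the choice to fail over only on 5xx (server-side) errors, since 4xx errors usually indicate a problem with the request itself.

```python
STATIC_EXTENSIONS = {"png", "jpg", "gif", "css", "js", "html", "txt"}

def pool_for(path, pools):
    """Pick a server pool from the file extension of the request path."""
    ext = path.rsplit(".", 1)[-1].lower() if "." in path else ""
    return pools["static"] if ext in STATIC_EXTENSIONS else pools["dynamic"]

def fetch_with_failover(servers, send):
    """Try each server in turn; skip any that answers with a 5xx status."""
    for server in servers:
        status, body = send(server)
        if status < 500:
            return server, status, body
    raise RuntimeError("all servers returned 5xx")

# Hypothetical pools and a fake transport in which app1 is broken.
pools = {"static": ["img1", "img2"], "dynamic": ["app1", "app2"]}

def fake_send(server):
    return (503, "") if server == "app1" else (200, "ok")

result = fetch_with_failover(pool_for("/cgi-bin/order.cgi", pools), fake_send)
```

The client that requested `/cgi-bin/order.cgi` never sees the 503 from `app1`; it receives the answer from `app2`, which is the transparent redirection described in 1).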
The disadvantages of seven-layer load balancing are as follows:
1) Seven-layer load balancing is limited by the protocols it supports (typically only HTTP), which limits how widely it can be applied.
2) Inspecting HTTP headers consumes a large amount of system resources and inevitably affects system performance. With a large number of connection requests, the load balancing device itself can easily become the bottleneck of overall network performance.

Five, load balancing strategy
In real applications we may not want to simply distribute client requests evenly across the internal servers regardless of whether a server is down. Instead, we may want a Pentium III server to accept more requests than a Pentium II server, a server that is currently handling fewer requests to be given more, and a failed server to receive no requests until it recovers, and so on. Choosing an appropriate load balancing strategy lets multiple devices complete the work together and eliminates or avoids the bottlenecks of uneven network load distribution, data congestion, and long response times. In each load balancing mode, load balancing at the second, third, fourth, and seventh layers of the OSI reference model has corresponding strategies for different application requirements.
The quality of a load balancing strategy and the ease of implementing it depend on two key factors: the load balancing algorithm, and the method and ability to detect the condition of the network and systems.
1. Load balancing algorithm
1) Round robin (Round Robin): Each request from the network is assigned to the internal servers in turn, from 1 to N, and then the cycle restarts. This algorithm suits server groups in which all servers have the same hardware and software configuration and the average load of service requests is fairly even.
2) Weighted round robin (Weighted Round Robin): Each server is assigned a weight according to its processing capacity, so that it receives a share of service requests proportional to its weight. For example, if server A has weight 1, B has weight 3, and C has weight 6, then servers A, B, and C receive 10%, 30%, and 60% of the service requests respectively. This algorithm ensures that high-performance servers get more use and avoids overloading low-performance servers.
3) Random (Random): Requests from the network are assigned at random to the internal servers.
4) Weighted random (Weighted Random): This algorithm is similar to weighted round robin, but the server for each request is chosen at random.
5) Response time (Response Time): The load balancing device sends a probe request (such as a ping) to each internal server and assigns the client's service request to the server that responds to the probe fastest. This algorithm reflects the current running state of the servers fairly well, but the fastest response time refers only to the time between the load balancing device and the server, not between the client and the server.
6) Least connections (Least Connection): The time each client request occupies a server can differ considerably. As running time grows, simple round robin or random balancing can leave the servers with very different numbers of active connections, so true load balancing is not achieved. The least connections algorithm keeps a record for each internal server of the number of connections it is currently handling, and assigns each new connection request to the server with the fewest connections, so that the balance better matches the actual load. This algorithm suits services whose requests take a long time to process, such as FTP.
7) Processing capacity (Processing Capacity): This algorithm assigns the service request to the server with the lightest processing load, calculated from the server's CPU model, number of CPUs, memory size, and current number of connections. Because it takes into account both the processing capacity of the internal servers and the current state of the network, this algorithm is comparatively more accurate and is especially suitable for seventh-layer (application layer) load balancing.
8) DNS response (Flash DNS): On the Internet, whether for HTTP, FTP, or other services, the client usually finds the exact IP address of the server through domain name resolution. With this algorithm, load balancing devices in different geographic locations all receive the same client's domain name resolution request, and at the same time each resolves the domain name to the IP address of its own server (the server in the same location as that load balancing device) and returns it to the client. The client continues its request with the first resolved IP address it receives and ignores the other responses. This strategy suits global load balancing and is meaningless for local load balancing.
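
Several of the algorithms in this list fit in a few lines each. The sketch below covers round robin, weighted round robin, weighted random, and least connections, using the A/B/C weights 1, 3, and 6 from the example in 2). The function names are hypothetical, and the weighted round robin uses the simplest possible schedule (each server repeated according to its weight); production balancers often interleave the servers more smoothly.

```python
import random
from itertools import cycle

def round_robin(servers):
    """Yield servers 1..N in order, then start over."""
    return cycle(servers)

def weighted_round_robin(weights):
    """Yield servers in proportion to integer weights (simple expanded schedule)."""
    schedule = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(schedule)

def weighted_random(weights, rng=random):
    """Pick one server at random, with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

def least_connections(active):
    """Pick the server currently handling the fewest connections."""
    return min(active, key=active.get)

rr = round_robin(["A", "B", "C"])
rr_order = [next(rr) for _ in range(4)]

wrr = weighted_round_robin({"A": 1, "B": 3, "C": 6})
window = [next(wrr) for _ in range(10)]
shares = {name: window.count(name) for name in "ABC"}

target = least_connections({"A": 12, "B": 4, "C": 9})
```

Over any window of 10 requests the weighted schedule gives A, B, and C exactly 1, 3, and 6 requests, matching the 10%, 30%, and 60% split in the text. Weighted random reaches the same proportions only on average over many requests.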
2. Network and system condition detection
Although load balancing algorithms can distribute data traffic well across servers, if the strategy has no means of detecting the condition of the network and systems, then when a server fails, or the network between the load balancing device and a server fails, the load balancing device will still direct part of the data traffic to that server. This inevitably loses a large number of service requests and fails the requirement of uninterrupted availability. A good load balancing strategy should therefore be able to detect network failures, server system failures, and application service failures:
1) Ping detection: Checks the state of the server and network by pinging. This method is simple and fast, but it can only roughly tell whether the network and the server's operating system are working; it cannot detect the state of the application services on the server.
2) TCP open detection: Each service opens a TCP port, so checking whether a TCP port on the server is open (such as port 23 for Telnet or port 80 for HTTP) determines whether the service is working.
3) HTTP URL detection: For example, send the HTTP server a request for the file Main.html; if an error message comes back, the server is considered faulty.
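
The TCP open and HTTP URL checks can be sketched with the Python standard library. This is a minimal illustration, not the probe logic of any particular product; the function names `tcp_open` and `http_ok`, the 2-second timeout, and the rule that any status of 400 or above counts as a failure are assumptions.

```python
import http.client
import socket
import urllib.error
import urllib.request

def tcp_open(host, port, timeout=2.0):
    """TCP open detection: True if a connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def http_ok(url, timeout=2.0):
    """HTTP URL detection: True if the URL answers with a status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # urlopen raises HTTPError (a URLError) for 4xx/5xx answers,
        # and URLError or OSError when the server cannot be reached.
        return False
```

A balancer would typically call something like `tcp_open("10.0.0.1", 80)` or `http_ok("http://10.0.0.1/Main.html")` (hypothetical addresses) on a timer and take a server out of rotation after a failed check. The HTTP check catches cases the TCP check misses, such as a Web server that accepts connections but returns errors.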
3. Other factors
Besides the two factors above, some applications require that all requests from the same client be assigned to the same server. For example, when a server stores a client's registration, shopping, and other request information in its local database, it is critical that the client's subsequent requests go to the same server. There are several ways to handle this:
1) Assign multiple requests from the same client to the same server according to the client IP address; the mapping between client IP address and server is stored on the load balancing device.
2) Place a unique identifier in a cookie in the client's browser and use it to assign multiple requests to the same server; this suits clients that go online through a proxy server.
3) There is also an out-of-path return mode (Out of Path Return). When a client connection request reaches the load balancing device, the central load balancing device directs it to a server, and the server's response no longer passes back through the central load balancing device; it bypasses the traffic distributor and returns directly to the client. The central load balancing device is then responsible only for accepting and forwarding requests, so its network load drops considerably and clients get faster responses. This mode is typically used for HTTP server farms: a virtual network adapter is installed on each server with its IP address set to the server farm's VIP, so that the server can respond directly to client requests and the three-way handshake completes smoothly.
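
Methods 1) and 2) both come down to a lookup table kept on the load balancing device; they differ only in the key. The sketch below is an illustration under assumptions: the class name `StickyTable`, the rotation used to assign a server to a new key, and the sample IP address and cookie values are all hypothetical.

```python
class StickyTable:
    """Remember which server was assigned to each key (client IP or cookie id)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.table = {}       # key -> server
        self._assigned = 0    # number of distinct keys seen so far

    def server_for(self, key):
        # A new key gets the next server in rotation; a known key keeps its server.
        if key not in self.table:
            self.table[key] = self.servers[self._assigned % len(self.servers)]
            self._assigned += 1
        return self.table[key]

by_ip = StickyTable(["s1", "s2"])
by_cookie = StickyTable(["s1", "s2"])

# Two different users behind the same proxy share one source IP address.
proxy_ip = "198.51.100.7"
ip_choice_1 = by_ip.server_for(proxy_ip)
ip_choice_2 = by_ip.server_for(proxy_ip)

# The same two users carry different cookie identifiers.
cookie_choice_1 = by_cookie.server_for("sid-user-1")
cookie_choice_2 = by_cookie.server_for("sid-user-2")
```

Keyed by IP, both users land on the same server because the balancer cannot tell them apart; keyed by cookie, they are tracked individually. This is why the cookie method suits clients that go online through a proxy server.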

Six, load balancing implementation elements
1) Performance
Performance is something we must focus on when introducing a balancing solution, but it is also one of the hardest issues to pin down. One measure is the number of packets per second that pass through the network; another is the maximum number of concurrent connections the server farm can handle under the balancing solution. Both matter: a balancing system that can handle millions of concurrent connections but forwards only 2 packets per second is obviously useless. Performance is closely tied to the processing capacity of the load balancing device and the balancing strategy it uses, and two points deserve attention. First, the overall performance of the balancing solution for the server farm, which is the key to how quickly client connection requests are answered. Second, the performance of the load balancing device itself, so that it does not become the service bottleneck when a large number of connection requests arrive. Sometimes a hybrid load balancing strategy can improve the overall performance of the server farm, for example DNS load balancing combined with NAT load balancing. In addition, for sites with many static document requests, caching is worth considering because it is relatively cheap and gives better response times; for sites that transmit a lot of SSL/XML content, SSL/XML acceleration is worth considering.
2) Scalability
IT changes rapidly: a product that was the latest a year ago may now be the lowest-performing one in the network, and a network built a year ago may now need another round of expansion because traffic has grown so fast. A suitable balancing solution should meet these needs. It should balance load across different operating systems and hardware platforms, balance load for different services such as HTTP, mail, news, proxy, database, firewall, and cache, and add or remove resources dynamically in a way that is completely transparent to the client.
3) Flexibility
A balancing solution should be flexible enough to serve different application requirements and keep up as they change. When different server groups have different application requirements, a variety of balancing strategies should be available to choose from.
4) Reliability
For sites with high quality-of-service requirements, the load balancing solution should provide complete fault tolerance and high availability for the server farm. The load balancing device itself can also fail, so there should be a good redundancy scheme to improve reliability. With redundancy, the load balancing devices in the same redundant unit must have an effective way to monitor each other, protecting the system as far as possible from the losses caused by a major failure.
5) Ease of management
Whether the balancing solution is software or hardware, we want it to be flexible, intuitive, and safe to manage, so that it is easy to install, configure, maintain, and monitor, which improves efficiency and avoids mistakes. Hardware load balancing devices currently offer three management methods. First, a command line interface (CLI: Command Line Interface), which can be reached through a HyperTerminal connection to the device's serial port or through remote Telnet login; the former is usually used for initial configuration. Second, a graphical user interface (GUI: Graphical User Interface), either managed through ordinary Web pages or, for secure management, through a Java applet, which generally requires a particular browser version on the management workstation. Third, SNMP (Simple Network Management Protocol) support, which lets third-party network management software manage SNMP-compliant devices.
