Large Website Architecture Series: Load Balancing in Detail (4)

Last Update:2016-01-25 Source: Internet

Author: User



This article is the fourth in the load-balancing series. It introduces the three request-forwarding modes and eight load-balancing algorithms of LVS, as well as the features and load-balancing algorithms of HAProxy. The reference articles are listed at the end.

3. LVS Load Balancing

LVS is open-source software created in May 1998 by Dr. Zhang Wensong, a graduate of the National University of Defense Technology, to provide simple load balancing on the Linux platform. LVS stands for Linux Virtual Server.

It is a load-balancing scheduling technology based on the IP layer: inside the operating system kernel, it forwards TCP/UDP requests at the IP layer to different servers, turning a group of servers into one high-performance, highly available virtual server.

Operating system: Linux

Development language: C

Concurrency performance: 4096 by default; the value can be changed, but doing so requires recompilation.

3.1. Functions

The main function of LVS is load balancing at the IP layer (network layer). It has three request-forwarding modes: NAT, TUN, and DR.

3.1.1. Load-balancing cluster in LVS/NAT mode

NAT stands for network address translation. The forwarding process is as follows: the director receives an external request, rewrites the destination address of the packet, and sends it to a real server chosen by the scheduling algorithm. After the real server finishes processing the request, it sends the response packet to its default gateway, which is the director. The director rewrites the source address of the packet and returns it to the client. This completes one round of load scheduling.

In the simplest LVS/NAT load-balancing cluster, the real servers can run any operating system and need no special configuration; the only requirement is that their default gateway points to the director. The real servers can use internal LAN addresses (192.168.0.0/24). The director must have two network cards: one bound to an external IP address (10.0.0.1), and the other bound to an internal LAN address (192.168.0.254) that serves as the default gateway for the real servers.

LVS/NAT is the easiest mode to implement, and because the real servers use internal addresses, it saves public IP addresses. However, NAT requires the director to rewrite every packet that flows through it, which adds some latency. When requests are short and responses are large, the director comes under heavy pressure and becomes a new bottleneck that limits the performance of the whole system.
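The NAT forwarding path described above can be sketched in a few lines of Python. This is a minimal simulation under stated assumptions, not LVS code: the Packet type, the function names, the client address 203.0.113.7, and the real-server addresses 192.168.0.1 and 192.168.0.2 are all hypothetical; only the virtual IP 10.0.0.1 comes from the text.

```python
from dataclasses import dataclass, replace
from itertools import cycle

@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: str

VIP = "10.0.0.1"                               # external address of the director
REAL_SERVERS = ["192.168.0.1", "192.168.0.2"]  # hypothetical internal addresses
_next_server = cycle(REAL_SERVERS)             # simple round-robin choice
_connections = {}                              # client address -> chosen real server

def director_inbound(pkt: Packet) -> Packet:
    """Rewrite the destination address (VIP -> chosen real server)."""
    if pkt.src not in _connections:
        _connections[pkt.src] = next(_next_server)
    return replace(pkt, dst=_connections[pkt.src])

def real_server_handle(pkt: Packet) -> Packet:
    """The real server answers the client; the reply leaves through
    its default gateway, which is the director."""
    return Packet(src=pkt.dst, dst=pkt.src, payload="response to " + pkt.payload)

def director_outbound(pkt: Packet) -> Packet:
    """Rewrite the source address (real server -> VIP) on the way out."""
    return replace(pkt, src=VIP)

request = Packet(src="203.0.113.7", dst=VIP, payload="GET /")
forwarded = director_inbound(request)
reply = director_outbound(real_server_handle(forwarded))
print(forwarded.dst, reply.src, reply.dst)
```

Both directions pass through the director, which is why large responses make it the bottleneck in this mode.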

3.1.2. Load-balancing cluster in LVS/TUN mode

TUN stands for IP tunneling. The forwarding process is as follows: the director receives an external request and, according to the scheduling algorithm, sends it through an IP tunnel to the chosen real server. The real server processes the request and returns the response packet directly to the client. This completes one round of load scheduling.

The simplest LVS/TUN load-balancing cluster uses IP tunneling to set up a tunnel between the director and each real server, and the load is distributed to the real servers through these tunnels. The relationship between the director and the real servers is fairly loose: they can be in the same network or in different networks, as long as they can reach each other through an IP tunnel. After a real server has processed a request, it sends the response directly to the client without passing through the director. In practice, each real server must have a public IP address so that it can communicate directly with clients, and all servers must support the IP tunneling protocol.

In this mode, the director assigns client requests to different real servers, and each real server processes its request and responds directly to the user. The director therefore handles only the client-to-server half of each connection, which greatly increases its scheduling capacity and allows the cluster to accommodate more nodes. In addition, real servers in TUN mode can run on any LAN or WAN, so a cross-region cluster can be built with stronger disaster tolerance. The cost is that the servers spend some resources on IP encapsulation, and the back-end real servers must run an operating system that supports IP tunneling.
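The tunneling idea above can be illustrated with a small Python sketch of IP-in-IP encapsulation. It is a simplified model, not LVS code: the Packet type, the function names, and every address except the virtual IP 10.0.0.1 are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: Union[str, "Packet"]   # an outer packet carries a whole inner packet

VIP = "10.0.0.1"                  # virtual IP that clients connect to
DIRECTOR_IP = "198.51.100.1"      # hypothetical address of the director
REAL_SERVER_IP = "198.51.100.20"  # hypothetical address of one real server

def director_encapsulate(pkt: Packet, real_server: str) -> Packet:
    """Wrap the unmodified client packet in an outer packet for the real server."""
    return Packet(src=DIRECTOR_IP, dst=real_server, payload=pkt)

def real_server_handle(outer: Packet) -> Packet:
    """Strip the outer header, then answer the client directly, using the
    inner destination (the VIP) as the source of the reply."""
    inner = outer.payload
    return Packet(src=inner.dst, dst=inner.src, payload="response to " + inner.payload)

request = Packet(src="203.0.113.7", dst=VIP, payload="GET /")
tunneled = director_encapsulate(request, REAL_SERVER_IP)
reply = real_server_handle(tunneled)
print(tunneled.dst, reply.src, reply.dst)
```

The reply never touches the director, which is the property that lets one director serve many more real servers than in NAT mode.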

3.1.3. Load-balancing cluster in LVS/DR mode

DR stands for direct routing. The forwarding process is as follows: the director receives an external request and, according to the scheduling algorithm, sends it directly to the chosen real server. The real server processes the request and returns the response packet directly to the client. This completes one round of load scheduling.

In the simplest LVS/DR load-balancing cluster, the real servers and the director are on the same physical network segment. The director's NIC address is 192.168.0.253, and a second address, 192.168.0.254, is bound as the virtual IP exposed to the outside world; external clients access the whole cluster through this address. Each real server binds 192.168.0.254 on its lo interface and adds the corresponding route.

LVS/DR is similar to LVS/TUN: the front-end director only receives and dispatches external requests and is not responsible for returning the responses. This raises the director's scheduling capacity, so the cluster can accommodate more real servers. However, LVS/DR rewrites the MAC address of the request frame, so all servers must be on the same physical network segment.
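The MAC rewrite that distinguishes DR mode can be sketched as follows. This is an illustrative model only: the Frame type, the function names, the MAC addresses, and the client address are hypothetical; the virtual IP 192.168.0.254 is the one from the text.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str
    payload: str

VIP = "192.168.0.254"
DIRECTOR_MAC = "02:00:00:00:00:fe"                   # hypothetical
REAL_SERVER_MACS = {"rs1": "02:00:00:00:00:01",      # hypothetical
                    "rs2": "02:00:00:00:00:02"}

def director_forward(frame: Frame, server: str) -> Frame:
    """Change only the MAC addresses; the IP header is left untouched."""
    return replace(frame, src_mac=DIRECTOR_MAC, dst_mac=REAL_SERVER_MACS[server])

def real_server_reply(frame: Frame, own_mac: str, next_hop_mac: str) -> Frame:
    """The real server accepts the frame because the VIP is bound on lo,
    then replies directly with the VIP as the source address."""
    if frame.dst_ip != VIP:
        raise ValueError("frame is not addressed to the virtual IP")
    return Frame(src_mac=own_mac, dst_mac=next_hop_mac,
                 src_ip=VIP, dst_ip=frame.src_ip,
                 payload="response to " + frame.payload)

request = Frame(src_mac="02:00:00:00:00:aa", dst_mac=DIRECTOR_MAC,
                src_ip="192.168.0.10", dst_ip=VIP, payload="GET /")
forwarded = director_forward(request, "rs1")
reply = real_server_reply(forwarded, REAL_SERVER_MACS["rs1"], request.src_mac)
print(forwarded.dst_mac, forwarded.dst_ip, reply.src_ip)
```

Because delivery relies on layer-2 addressing alone, the sketch only makes sense when the director and real servers share one network segment, which matches the restriction stated above.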

3.3 Architecture

A server cluster built with LVS consists of three parts: the load-balancing layer at the front (Load Balancer), the server group layer in the middle (Server Array), and the data-sharing storage layer at the bottom (Shared Storage). All of this is transparent to users, who simply use the high-performance service provided by one virtual server.

System Architecture of LVS

Detailed description of each LVS layer:

Load Balancer layer: located at the front of the cluster, it consists of one or more load schedulers (director servers). The LVS module is installed on the director server, whose main role is similar to that of a router: it holds the routing tables that implement the LVS function and uses them to distribute user requests to the application servers (real servers) in the Server Array layer. The director server also runs ldirectord, a monitoring module that checks the health of each real server's services. When a real server becomes unavailable, it is removed from the LVS routing table; when it recovers, it is added back.
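The remove-on-failure and re-add-on-recovery behavior described above can be sketched as a tiny pool class. This is not how ldirectord is implemented; the class name, the method name, and the probe callback are assumptions made for illustration.

```python
class RealServerPool:
    """Tracks which real servers currently receive traffic."""

    def __init__(self, servers):
        self.all_servers = list(servers)   # every configured real server
        self.active = list(servers)        # servers currently in the routing table

    def apply_check(self, probe):
        """Run probe(server) -> bool for every configured server.
        Failing servers drop out; recovered servers come back."""
        self.active = [s for s in self.all_servers if probe(s)]
        return self.active

# A dict stands in for real health probes (for example an HTTP or TCP check).
health = {"rs1": True, "rs2": False, "rs3": True}
pool = RealServerPool(["rs1", "rs2", "rs3"])

after_failure = pool.apply_check(health.get)    # rs2 is taken out
health["rs2"] = True
after_recovery = pool.apply_check(health.get)   # rs2 is added back
print(after_failure, after_recovery)
```

In a real deployment the check would run periodically and the scheduler would only ever choose from the active list.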

Server Array layer: a group of machines that actually run the application services. A real server can be one or more of a web server, mail server, FTP server, DNS server, or video server. The real servers are connected to each other over a high-speed LAN or across a WAN. In practice, a director server can also act as a real server at the same time.

Shared Storage layer: a storage area that provides shared storage space and content consistency for all real servers. Physically it usually consists of disk array devices. To keep content consistent, data can generally be shared through the NFS network file system. NFS does not perform very well in busy systems, however; in that case a cluster file system can be used, such as Red Hat's GFS or Oracle's OCFS2.

As the overall structure shows, the director server is the core of LVS. Currently, the director server can run only Linux or FreeBSD. The Linux 2.6 kernel supports LVS without any extra setup, while FreeBSD is rarely used as a director server and its performance is not very good. For real servers, almost all platforms are well supported, including Linux, Windows, Solaris, AIX, and the BSD family.

3.4 Balancing Strategy

LVS supports eight load balancing policies by default, as outlined below:

3.4.1. Round-robin scheduling (Round Robin)

The scheduler uses the round-robin algorithm to distribute external requests to the real servers in the cluster in turn. It treats every server equally, regardless of the actual number of connections or the system load on each server.
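A minimal Python sketch of round-robin selection follows. The class name and server names are hypothetical, and the sketch ignores server availability.

```python
class RoundRobin:
    """Hands out servers in a fixed rotating order."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.index = -1

    def pick(self):
        # Advance to the next server, wrapping around at the end of the list.
        self.index = (self.index + 1) % len(self.servers)
        return self.servers[self.index]

rr = RoundRobin(["rs1", "rs2", "rs3"])
order = [rr.pick() for _ in range(5)]
print(order)  # ['rs1', 'rs2', 'rs3', 'rs1', 'rs2']
```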

3.4.2. Weighted round-robin (Weighted Round Robin)

The scheduler dispatches requests with the weighted round-robin algorithm according to the different processing capacities of the real servers. This ensures that servers with more processing capacity handle more traffic. The scheduler can automatically query the load of the real servers and adjust their weights dynamically.
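One common way to implement weighted round-robin is to sweep a threshold from the highest weight downward in steps of the greatest common divisor of the weights, and to select in each pass only the servers whose weight reaches the threshold. The sketch below follows that approach with static weights; the class name, server names, and weights are assumptions, and dynamic weight adjustment is not modeled.

```python
from functools import reduce
from math import gcd

class WeightedRoundRobin:
    def __init__(self, weights):
        # weights: server name -> positive integer weight
        self.servers = list(weights.items())
        self.max_weight = max(weights.values())
        if self.max_weight <= 0:
            raise ValueError("at least one server needs a positive weight")
        self.step = reduce(gcd, weights.values())
        self.index = -1
        self.current_weight = 0

    def pick(self):
        while True:
            self.index = (self.index + 1) % len(self.servers)
            if self.index == 0:
                # A full pass is done: lower the threshold, or reset it.
                self.current_weight -= self.step
                if self.current_weight <= 0:
                    self.current_weight = self.max_weight
            name, weight = self.servers[self.index]
            if weight >= self.current_weight:
                return name

wrr = WeightedRoundRobin({"A": 4, "B": 3, "C": 2})
sequence = "".join(wrr.pick() for _ in range(9))
print(sequence)  # AABABCABC
```

Over one full cycle of nine requests, A, B, and C receive four, three, and two requests, in proportion to their weights.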

3.4.3. Least connections (Least Connections)

With the least-connections algorithm, the scheduler dynamically dispatches network requests to the server with the fewest established connections. If the real servers in the cluster have similar performance, this algorithm balances the load well.
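Least-connections selection reduces to picking the minimum of a connection counter. The function name, server names, and counts in this sketch are hypothetical.

```python
def least_connections(active):
    """active: server name -> number of established connections."""
    return min(active, key=active.get)

connections = {"rs1": 12, "rs2": 7, "rs3": 9}
chosen = least_connections(connections)
connections[chosen] += 1   # the new request now counts against that server
print(chosen, connections[chosen])  # rs2 8
```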

3.4.4. Weighted least connections (Weighted Least Connections)

When server performance differs widely within the cluster, the scheduler uses the weighted least-connections algorithm to optimize load balancing: servers with higher weights bear a larger share of the active connections. The scheduler can automatically query the load of the real servers and adjust their weights dynamically.
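A simplified way to express weighted least connections is to pick the server with the lowest ratio of active connections to weight. The sketch below uses that simplification; the function name and sample numbers are assumptions, and a real scheduler may compute the load figure differently.

```python
def weighted_least_connections(servers):
    """servers: name -> (active_connections, weight).
    Servers with weight 0 are treated as unavailable."""
    ratios = {name: conns / weight
              for name, (conns, weight) in servers.items() if weight > 0}
    return min(ratios, key=ratios.get)

cluster = {"rs1": (30, 3), "rs2": (12, 1), "rs3": (25, 2)}
chosen = weighted_least_connections(cluster)
print(chosen)  # rs1: 30/3 = 10 is lower than 12/1 = 12 and 25/2 = 12.5
```

Note that rs1 is chosen although it has the most connections in absolute terms, because its weight says it can carry three times the load of rs2.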

3.4.5. Locality-based least connections (Locality-Based Least Connections)

The locality-based least-connections (LBLC) algorithm balances load by destination IP address and is mainly used in cache cluster systems. Based on the destination IP address of a request, the algorithm finds the server most recently used for that address. If that server is available and not overloaded, the request is sent to it. If no such server exists, or if the server is overloaded and another server is at less than half of its workload, an available server is selected by the least-connections principle and the request is sent to that server.
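The LBLC rule above can be sketched as a table from destination IP to server plus an overload test. This is an illustrative model under assumptions: a server counts as overloaded when its active connections exceed its weight, and as half loaded when they are below half its weight; availability checks and connection release are left out, and all names are hypothetical.

```python
class LBLC:
    def __init__(self, weights):
        self.weights = dict(weights)             # server -> weight (> 0)
        self.conns = {s: 0 for s in weights}     # server -> active connections
        self.table = {}                          # destination IP -> server

    def _least_loaded(self):
        return min(self.conns, key=lambda s: self.conns[s] / self.weights[s])

    def _overloaded(self, server):
        return self.conns[server] > self.weights[server]

    def _some_server_half_loaded(self):
        return any(self.conns[s] < self.weights[s] / 2 for s in self.conns)

    def pick(self, dest_ip):
        server = self.table.get(dest_ip)
        if server is None or (self._overloaded(server)
                              and self._some_server_half_loaded()):
            # No mapping yet, or the mapped server is too busy: remap.
            server = self._least_loaded()
            self.table[dest_ip] = server
        self.conns[server] += 1
        return server

lb = LBLC({"cache1": 4, "cache2": 4})
first_five = [lb.pick("10.1.1.1") for _ in range(5)]
sixth = lb.pick("10.1.1.1")
print(first_five, sixth)
```

The first five requests for the same destination stay on cache1, which keeps its cache warm; only once cache1 exceeds its weight while cache2 is nearly idle does the destination move.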

3.4.6. Locality-based least connections with replication (Locality-Based Least Connections with Replication)

The locality-based least-connections with replication (LBLCR) algorithm also balances load by destination IP address and is mainly used in cache cluster systems. It differs from LBLC in that it maintains a mapping from a destination IP address to a set of servers, whereas LBLC maintains a mapping from a destination IP address to a single server. Based on the destination IP address of a request, the algorithm finds the server group for that address and selects a server from the group by the least-connections principle. If that server is not overloaded, the request is sent to it. If it is overloaded, a server is selected from the whole cluster by the least-connections principle, added to the server group, and given the request. In addition, when the server group has not been modified for some time, the busiest server is removed from the group to reduce the degree of replication.
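The difference from LBLC, a set of servers per destination instead of a single server, can be sketched as follows. The class and method names are hypothetical, overload is assumed to mean more active connections than the weight, and the timing of group shrinking is left to the caller rather than modeled.

```python
class LBLCR:
    def __init__(self, weights):
        self.weights = dict(weights)             # server -> weight (> 0)
        self.conns = {s: 0 for s in weights}     # server -> active connections
        self.table = {}                          # destination IP -> set of servers

    def _load(self, server):
        return self.conns[server] / self.weights[server]

    def pick(self, dest_ip):
        group = self.table.setdefault(dest_ip, set())
        server = min(group, key=self._load) if group else None
        if server is None or self.conns[server] > self.weights[server]:
            # Empty group, or its least-loaded member is overloaded:
            # take the least-loaded server of the whole cluster and add it.
            server = min(self.conns, key=self._load)
            group.add(server)
        self.conns[server] += 1
        return server

    def shrink(self, dest_ip):
        """Drop the busiest server from the group. Intended to be called
        when the group has not changed for some time."""
        group = self.table.get(dest_ip, set())
        if len(group) > 1:
            group.discard(max(group, key=self._load))

lb = LBLCR({"cache1": 2, "cache2": 2})
picks = [lb.pick("10.1.1.1") for _ in range(4)]
group_before = set(lb.table["10.1.1.1"])
lb.shrink("10.1.1.1")
group_after = set(lb.table["10.1.1.1"])
print(picks, group_before, group_after)
```

Replication lets a hot destination spread over several caches, and shrinking pulls the group back toward a single server once the burst has passed.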

3.4.7. Destination address hashing (Destination Hashing)

The destination-hashing algorithm uses the destination IP address of the request as the hash key to look up the corresponding server in a statically assigned hash table. If that server is available and not overloaded, the request is sent to it; otherwise the scheduler returns null.

3.4.8. Source address hashing (Source Hashing)

The source-hashing algorithm uses the source IP address of the request as the hash key to look up the corresponding server in a statically assigned hash table. If that server is available and not overloaded, the request is sent to it; otherwise the scheduler returns null.

In addition to the load-balancing algorithms above, you can also define a custom balancing strategy.
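Destination hashing and source hashing share the same lookup and differ only in which address is used as the key. The sketch below shows the idea with a static table; the class name, the table size of 256, and the plain modulo hash are simplifying assumptions, and the overload check is omitted.

```python
import ipaddress

class HashScheduler:
    def __init__(self, servers, table_size=256):
        # Static table: bucket i always points at the same server.
        self.table = [servers[i % len(servers)] for i in range(table_size)]
        self.available = set(servers)

    def pick(self, ip):
        """ip is the destination address (destination hashing) or the
        source address (source hashing) of the request."""
        key = int(ipaddress.ip_address(ip))
        server = self.table[key % len(self.table)]
        # Return None instead of falling back when the server is down.
        return server if server in self.available else None

sched = HashScheduler(["rs1", "rs2", "rs3"])
first = sched.pick("203.0.113.7")
second = sched.pick("203.0.113.7")   # same key, same server
sched.available.discard(first)
after_failure = sched.pick("203.0.113.7")
print(first, second, after_failure)
```

Because the table is static, the same address always maps to the same server, which is what makes the scheme useful for caches and for sticky client sessions.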

3.5 Scenarios

LVS is typically used for ingress load balancing or internal load balancing, in combination with a reverse proxy server. For the related architecture, refer to the Nginx scenario architecture.

4. HAProxy Load Balancing

HAProxy is another widely used load-balancing software. It provides high availability, load balancing, and proxying for TCP- and HTTP-based applications, supports virtual hosts, and is a free, fast, and reliable solution. It is especially suitable for heavily loaded websites. Its operating mode allows it to be integrated easily and safely into an existing architecture while keeping your web servers from being exposed to the network.

4.1. Features

Supports two proxy modes, TCP (layer 4) and HTTP (layer 7), and supports virtual hosts;
Simple configuration, with support for checking back-end server status by URL;
Used purely as load-balancing software, it processes requests faster than Nginx under high concurrency;
The TCP mode can be used to load balance MySQL slave (read) servers, detecting and balancing the back-end DB nodes;
It can make up for some shortcomings of Nginx, such as session persistence and cookie-based routing.
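Cookie-based session persistence, mentioned in the feature list above, can be illustrated with a short Python sketch. It models the idea only and is not HAProxy code: the class name, the cookie name SERVERID, and the server names are assumptions.

```python
from itertools import cycle

class CookiePersistence:
    def __init__(self, servers, cookie_name="SERVERID"):
        self.servers = set(servers)
        self._rotation = cycle(list(servers))   # round robin for new sessions
        self.cookie_name = cookie_name

    def route(self, cookies):
        """Return (server, cookies_to_set) for a request's cookie dict."""
        server = cookies.get(self.cookie_name)
        if server in self.servers:
            return server, {}                    # known session: stick to it
        server = next(self._rotation)            # new session: pick and remember
        return server, {self.cookie_name: server}

lb = CookiePersistence(["web1", "web2"])
first_server, set_cookies = lb.route({})            # first visit, no cookie yet
repeat_server, _ = lb.route(set_cookies)            # browser sends the cookie back
other_server, _ = lb.route({})                      # a different new client
print(first_server, repeat_server, other_server)    # web1 web1 web2
```

A returning client lands on the server that holds its session, while new clients are still spread across the pool.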

4.2. Balancing strategy

Four common algorithms are supported:

1. roundrobin: round robin, assigning requests to the back-end servers in turn;

2. static-rr: round robin based on the statically configured weights of the back-end servers;

3. leastconn: the server with the fewest connections is preferred;

4. source: based on the source IP of the request, similar to Nginx ip_hash.

5. Summary of this session

That concludes this week's session, which covered the background of software load balancing, Nginx load balancing, LVS load balancing, and HAProxy load balancing.

Because of limited time, some explanations are not detailed; you can look up further details on Baidu or Google. I hope this session is helpful to everyone.

Resources:

Schematic diagram of the implementation of the Nginx load balancer: http://www.server110.com/nginx/201403/7225.html

Nginx architecture and detailed optimization of web service configuration: http://linux.it.net.cn/e/server/nginx/2015/0102/11183.html

Nginx dual-master scenario: http://network.51cto.com/art/201109/288597.htm

Using the LVS framework to load balance a Linux cluster system: http://blog.chinaunix.net/uid-45094-id-3012037.html

Basic introduction to LVS: http://os.51cto.com/art/201202/317108.htm

Next session: next week, December 9, 7:30 to 8:30 p.m. Topic: "Large Website Architecture Series: Distributed Message Queuing Technology".


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
