Summary of Load Balancing

Source: Internet
Author: User

Issues to consider
  
Before proposing any specific load balancing solution, let's first walk through the issues that need to be considered when designing a load balancing system.
  
The first thing to note is the high availability and scalability of the load balancing system itself. As mentioned earlier, by using load balancing, a service made up of many server instances gains both high availability and scalability: when one service instance fails, the other instances can take over part of its work, and when the total capacity of the service becomes tight, we can add new service instances to expand it.
  
However, since all traffic passes through the load balancing server, once that server fails, the entire system becomes unusable. In other words, the availability of the load balancing server determines the availability of the entire system.
  
How this problem is solved depends on the type of load balancing server. For L3/4 load balancing servers, the common industry practice is to deploy them in pairs so that the system has no single point of failure: when one of the two fails, the other continues to provide load balancing for the entire system. Such a pair can run in either active-passive mode or active-active mode.
  
In active-passive mode, one load balancing server stays in a semi-dormant standby state. It monitors the availability of its peer by exchanging heartbeat messages with it. When the working load balancing server stops responding to heartbeats, the standby wakes up, takes over the working server's IP address, and begins performing load balancing.
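The standby-side logic described above can be sketched as follows. This is a minimal illustration with assumed timings and addresses; real products (keepalived, for example) implement failover far more carefully, including claiming the virtual IP at the network layer:

```python
import time

HEARTBEAT_TIMEOUT = 3.0  # seconds without a heartbeat before failover (assumed value)

class PassiveBalancer:
    """Standby load balancer that takes over the virtual IP when the
    active peer stops sending heartbeats."""

    def __init__(self, virtual_ip):
        self.virtual_ip = virtual_ip
        self.active = False
        self.last_heartbeat = time.monotonic()

    def on_heartbeat(self):
        # Called whenever a heartbeat message arrives from the active peer.
        self.last_heartbeat = time.monotonic()

    def check_peer(self):
        # Called periodically; promotes this node if the peer seems dead.
        if not self.active and time.monotonic() - self.last_heartbeat > HEARTBEAT_TIMEOUT:
            self.take_over()

    def take_over(self):
        # In a real system this would claim the virtual IP (e.g. via
        # gratuitous ARP) and start forwarding traffic.
        self.active = True
        print(f"taking over virtual IP {self.virtual_ip}")
```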

In active-active mode, the two load balancing servers work simultaneously. If one of them fails, the other takes over all of the work.

Each mode has its merits. Active-active mode copes better with large traffic fluctuations. For example, suppose that under normal conditions each of the two servers runs at about 30% load; during peak hours the traffic may double, pushing each server to about 60%, which is still within what the system can handle. In active-passive mode, the single active server would normally run at 60% load, and at peak time its load would reach 120% of its capacity, so the service could not process all user requests.
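The arithmetic behind this comparison can be spelled out explicitly; the 30%/60% figures and the 2x peak factor are the example values from the text:

```python
def per_server_load(total_load, active_servers):
    """Load carried by each active server, as a fraction of one
    server's capacity."""
    return total_load / active_servers

# Total normal traffic equals 60% of one server's capacity
# (i.e. 30% on each of two active-active servers).
normal_total = 0.6
peak_total = normal_total * 2  # traffic doubles at peak

# Active-active: both servers share the load.
print(per_server_load(normal_total, 2))  # 30% each
print(per_server_load(peak_total, 2))    # 60% each -- still manageable

# Active-passive: one server carries everything.
print(per_server_load(normal_total, 1))  # 60%
print(per_server_load(peak_total, 1))    # 120% -- overloaded
```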
  
Conversely, active-active mode has a drawback: it invites management negligence. For example, in a system running in active-active mode, the two load balancing servers may each run at around 60% load all year round; once one of them fails, the remaining server cannot handle all user requests.
  
You may ask: must there be exactly two L3/4 load balancing servers? In fact, this is mainly determined by the load balancing products themselves. As mentioned before, probing the availability of a load balancing server requires fairly complex logic, so if a load balancing system used many L3/4 load balancing servers, the heartbeat probes exchanged among them would consume considerable resources. Moreover, many L3/4 load balancing servers are hardware-based, so they run very fast and can often saturate the network bandwidth they are attached to. For these reasons, L3/4 load balancing servers are generally used in pairs.
  
If the L3/4 load balancing servers really approach their capacity limit, we can further distribute requests through DNS load balancing.

This approach not only solves the scalability problem, but also exploits a feature of DNS to improve the user experience: DNS can select the server closest to the user based on the user's region. This is particularly effective for a global service; after all, a user in China reaches a service instance deployed in China much faster than one deployed in the United States.
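A toy sketch of the region-based selection a geo-aware DNS service performs; the region codes and IP addresses here are made up for illustration:

```python
# Map each service region to the address of the balancer deployed
# there (hypothetical addresses).
SERVERS_BY_REGION = {
    "cn": "203.0.113.10",   # instance deployed in China
    "us": "198.51.100.10",  # instance deployed in the United States
}
DEFAULT_REGION = "us"

def resolve(client_region):
    """Return the address of the server closest to the client,
    the way a geo-aware DNS resolver would."""
    return SERVERS_BY_REGION.get(client_region, SERVERS_BY_REGION[DEFAULT_REGION])

# A Chinese user is directed to the instance in China.
print(resolve("cn"))
```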
  
L7 load balancing servers, in turn, are primarily software-based, so many of them allow users to build more complex load balancing topologies, for example a group of three L7 load balancing servers with two active and one on standby.
  
Having covered high availability, let's turn to the scalability of load balancing servers. As just described, L3/4 load balancing servers offer very high performance, so the load balancing systems of most services never need to scale them out. If the need does arise, DNS load balancing provides good scalability. L7 load balancing is more flexible, so scalability is not a problem there.
  
However, a load balancing system should not be built only from L3/4 load balancing servers, nor only from L7 load balancing servers, because the two differ greatly in both performance and price. An L3/4 load balancing server is very expensive, often costing tens of thousands of dollars, whereas an L7 load balancing server can be built from inexpensive commodity machines. L3/4 servers provide very high per-unit performance, while L7 servers achieve high aggregate performance by forming a cluster.
  
Another thing to consider when designing a load balancing system is the separation of static and dynamic content. A service typically handles both dynamic and static requests, and the two have very different characteristics: a dynamic request usually requires a lot of computation but transfers little data, while a static request transfers a large amount of data but needs little computation. Different service containers also perform very differently on these two kinds of requests. Therefore, many services split their instances into two groups, one handling static requests and one handling dynamic requests, each running on an appropriate service container. In that case, static content is usually placed under a specific path such as "/static", so the load balancing server can forward dynamic and static requests appropriately based on the path of each request.
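The path-based dispatch described above can be sketched as follows; the pool addresses are hypothetical:

```python
# Backend pools for the two kinds of requests (hypothetical addresses).
STATIC_POOL = ["10.0.1.1", "10.0.1.2"]   # containers tuned for large data transfers
DYNAMIC_POOL = ["10.0.2.1", "10.0.2.2"]  # containers tuned for heavy computation

def choose_pool(path):
    """Route by URL path: everything under /static goes to the
    static pool, everything else to the dynamic pool."""
    if path == "/static" or path.startswith("/static/"):
        return STATIC_POOL
    return DYNAMIC_POOL
```

Real L7 balancers express the same idea declaratively, for example as a path-prefix rule in their configuration rather than in application code.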
  
The last thing worth mentioning is LVS (Linux Virtual Server), a software implementation of an L3/4 load balancing server. Compared with a hardware implementation, the software has to do a lot of extra work, such as decoding packets and allocating memory for processing them, so its performance is often only 1/5 to 1/10 that of an L3/4 load balancing server with equivalent hardware capabilities. Given its limited performance but very low setup cost, for example by reusing idle machines in the lab, it is often used as a temporary substitute while the service is still small.

Load Balancing Solutions
  
To close this article, we list a series of common load balancing setups for your reference.
  
In general, the load on a service grows over time, and its load balancing system correspondingly evolves from small to large. We will therefore introduce these setups progressively, from small to large.
  
The first is the simplest system, containing a pair of L7 load balancing servers.
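A minimal sketch of the round-robin dispatch such an L7 load balancing server might perform; the backend addresses are made up, and a real deployment would use software such as nginx or HAProxy rather than hand-rolled code:

```python
import itertools

class RoundRobinBalancer:
    """Cycle through backend service instances in turn."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        # Return the backend that should receive the next request.
        return next(self._cycle)

lb = RoundRobinBalancer(["10.0.0.1:8080", "10.0.0.2:8080"])
print(lb.pick())  # 10.0.0.1:8080
print(lb.pick())  # 10.0.0.2:8080
print(lb.pick())  # wraps back to 10.0.0.1:8080
```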

As the load of the service gradually increases, the L7 load balancing tier can easily become a bottleneck. At that point we can relieve it by adding an SSL farm and a server running LVS.

If the load keeps increasing, we need to replace LVS with real hardware-based L3/4 load balancing servers and increase the capacity of each tier.

Since the lower three tiers of this setup are theoretically infinitely scalable, the component most likely to be overloaded is the topmost L3/4 load balancing server. In that case, we use DNS to distribute the load.

