Web cluster and load balancing


Original reference: Http://www.cnblogs.com/lovingprince/archive/2008/11/13/2166350.html

A Web cluster is a group of servers running the same Web application simultaneously, appearing to clients as a single server, with the member servers cooperating to deliver higher-performance service. A more formal definition: a cluster is a set of independent servers that behaves on the network as a single system and is managed as one, and that single system provides highly reliable service to client workstations. Load balancing, by contrast, is the task of distributing work sensibly among the servers of a cluster, so that no server becomes overloaded while others leave their processing power unused. Load balancing has two aspects: first, heavy concurrent access or data traffic is shared across multiple nodes, reducing the time users wait for a response; second, a single heavy computation is split across multiple nodes for parallel processing, and when each node finishes, the partial results are gathered and returned to the user, greatly increasing the system's overall processing capacity.
Clusters and load balancing are therefore essentially different: they are solutions to two different problems and should not be confused.

Cluster technology can be divided into three main categories:
1. High performance cluster (HPC Cluster)
2. High Availability cluster (HA Cluster)
3. High scalability cluster

1. High performance cluster (HPC Cluster): cluster technology aimed at improving scientific computing capability. It is used mainly for scientific computation and is not covered further here; interested readers can consult the relevant literature.
2. High Availability cluster (HA Cluster): cluster technology that reduces service downtime, aiming to keep the cluster's overall services available as much as possible. If a node in a high-availability cluster fails, another node takes over its work for the duration of the outage; the load on that node, of course, increases correspondingly.
To improve the availability of the system as a whole, beyond making each computer component more reliable, a cluster scheme is generally adopted.
Such clusters generally work in one of two modes:
① Active-active (master-master) mode
This is the most common cluster model. It provides high availability and acceptable performance even when only one node remains, and it allows maximum use of hardware resources. Each node serves clients over the network with its own defined capacity and optimal performance, and during a failover each node can temporarily take over the other's work. All services remain available after a failover, but performance is usually reduced.

This is currently the most widely used dual-node dual-application active/active mode.

Under normal conditions, the applications supporting the business run on both nodes, each with its own resources, such as IP addresses, volumes on the disk array, or file systems. When a system or resource on one side fails, the application and its associated resources are switched over to the other node.

The biggest advantage of this model is that there is no "idle" server: both servers do useful work under normal conditions. If a failure occurs, however, both applications end up running on the same server, whose processing power may not meet the combined peak requirements of the database and the application; processing capacity then falls short and business responsiveness degrades.
② Active-standby (master-standby) mode
To provide maximum availability with minimal performance impact, this mode keeps one node on standby during normal operation: the primary node handles client requests while the standby node stays idle. When the primary node fails, the standby takes over its work and continues serving clients, with no performance impact.


The two-node active/standby model is the simplest HA configuration: two servers form a cluster linked by dual heartbeat lines. The application, combined with optional system components such as an external shared disk array, file systems, and a floating IP address, makes up the business runtime environment.

PCL provides a fully redundant server configuration for this environment. The trade-offs of this model are: disadvantage: Node2 sits "idle" while Node1 works normally, which wastes server resources; advantage: when Node1 fails, Node2 can take over the application completely and is assured of meeting the application's processing-capacity requirements.

3. High scalability cluster
This refers to server cluster technology with a load-balancing policy (algorithm). A load-balanced cluster offers a more practical solution for enterprise needs: it spreads the load as evenly as possible across the machines of the cluster. The load may be application processing load or network traffic load. This scenario suits nodes running the same set of applications: each node can handle part of the load, and load can be reallocated dynamically between nodes to keep them balanced. The same holds for network traffic: when traffic is too heavy for a single node to process quickly, the excess must be sent on to other nodes. Allocation can also be optimized for the different resources available on each node or for special characteristics of the network.

Load-balanced clusters distribute network or computational load among multiple nodes according to some policy (algorithm). Building on the existing network architecture, load balancing provides a cheap and effective way to extend server bandwidth, increase throughput, and improve data-processing capability while avoiding a single point of failure.

4. Load Balancing Strategy

As stated above, the role of load balancing is to distribute network or computational load among multiple nodes according to some policy (algorithm). Load balancing can be implemented in software or in hardware; the general architecture is as follows.


Multiple Web nodes in the background run the same Web application. A user's request first reaches the load-balancing dispatch node (which may be software or hardware), which assigns it to one of the Web application nodes according to the load-balancing policy (algorithm). Since keeping the content identical on every Web node is easy, the choice of load-balancing policy (algorithm) becomes the key issue. The balancing algorithms are described below.

The job of Web load balancing is to distribute requests evenly across the nodes. It is a dynamic balance: tools analyze the packets in real time, track the state of the data flow on the network, and distribute requests accordingly. Different application environments call for different balancing policies (algorithms): an e-commerce Web site carries a heavy computational load; a networked database application reads and writes frequently, putting great pressure on the servers' storage subsystems; a video service moves large volumes of data, burdening the network interfaces. Balancing policies (algorithms) therefore take many forms. In the broad sense, load balancing can be done by a dedicated gateway or load balancer, or implemented with proprietary software and protocols. In the OSI seven-layer model, layer 2 (data link), layer 3 (network), layer 4 (transport), and layer 7 (application) each have corresponding load-balancing strategies (algorithms). At the data link layer, balancing chooses different paths according to a packet's destination MAC address; at the network layer, allocation based on IP addresses steers the data streams to multiple nodes; and switching at the transport and application layers is a traffic-based control mode that can likewise achieve load balance.
At present there are three common load-balancing algorithms: round robin, least connections first, and fastest response priority.
① The round-robin algorithm assigns requests arriving from the network to the nodes of the cluster in strict turn.
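A minimal sketch of round-robin dispatch (node names are hypothetical): requests are handed to cluster nodes strictly in turn, wrapping back to the first node after the last.

```python
from itertools import cycle


class RoundRobin:
    """Hand each request to the next cluster node in fixed rotation."""

    def __init__(self, nodes):
        self._it = cycle(nodes)  # endless iterator over the node list

    def pick(self):
        return next(self._it)    # the node that should serve the next request


rr = RoundRobin(["web1", "web2", "web3"])
```

Note that the rotation ignores how busy each node actually is, which is exactly the weakness the least-connections algorithm below addresses.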
② The least-connections algorithm keeps a register for each server in the cluster recording its current number of connections, and the load-balancing system always assigns the next task to the server with the fewest current connections. This works much better than round robin because, in some situations, a simple rotation cannot tell which node is less loaded and may hand new work to a server that is already busy.
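A minimal least-connections sketch: the per-server counters play the role of the "registers" described above, incremented when a connection is assigned and decremented when it closes.

```python
class LeastConnections:
    """Always assign the next task to the server with the fewest live connections."""

    def __init__(self, servers):
        self.conns = {s: 0 for s in servers}  # the per-server connection registers

    def acquire(self):
        server = min(self.conns, key=self.conns.get)  # least-loaded server
        self.conns[server] += 1
        return server

    def release(self, server):
        self.conns[server] -= 1  # call when the connection closes
```

The caller must pair every `acquire()` with a `release()`, otherwise the counters drift and the balance degrades.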
③ The fastest-response-priority algorithm allocates tasks according to the state of the cluster nodes (CPU, memory, and so on). This is difficult in practice, and so far few load-balancing systems use it. Hardware load-balancing devices in particular work only at the TCP/IP level and can hardly reach into a server's internals for monitoring. It is nonetheless the direction of future development.
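One way such a policy is commonly approximated, sketched here as an assumption rather than anything the original describes, is to track an exponentially weighted moving average of each node's observed response time and route the next request to the currently fastest node:

```python
class ResponseTimeBalancer:
    """Route to the node with the lowest smoothed response time (EWMA)."""

    def __init__(self, nodes, alpha=0.3):
        self.alpha = alpha                   # smoothing factor for new samples
        self.ewma = {n: 0.0 for n in nodes}  # 0.0 means "no sample yet"

    def pick(self):
        return min(self.ewma, key=self.ewma.get)

    def record(self, node, seconds):
        """Feed back the measured response time of a completed request."""
        prev = self.ewma[node]
        self.ewma[node] = seconds if prev == 0 else (1 - self.alpha) * prev + self.alpha * seconds
```

This only measures responses from the outside, which is precisely the limitation noted above: it cannot see CPU or memory state inside the server.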


Those are the commonly used load-balancing algorithms. Based on how these algorithms are applied, load balancing falls into the following categories:

1. DNS Polling
The earliest load-balancing technique was implemented through DNS: multiple addresses are configured under the same name in DNS, so a client querying that name receives one of the addresses. Different clients thus reach different servers, achieving load balancing.

DNS load balancing is simple and efficient, but it cannot distinguish between servers or account for a server's current running state. When using it, you must try to ensure that different client machines receive the different addresses evenly. DNS records carry a refresh time (TTL); once it expires, other DNS servers must query the authoritative server again for the address data, and may then obtain a different IP address. To keep addresses randomly distributed, the refresh time should therefore be kept as short as possible, so that local DNS servers in different places keep updating their records and clients reach the addresses at random; but setting the expiry time too short greatly increases DNS traffic and causes additional network problems. Another problem with DNS load balancing is that once a server fails, client machines holding the failed server's address cannot reach it: even if the DNS settings are changed promptly, those clients must still wait out the refresh time before the change takes effect.
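The mechanism can be illustrated with a toy resolver (the zone data below is invented for the example): the authoritative server rotates the order of the A records on each query, so successive clients see a different first address.

```python
from collections import deque


class ToyDNS:
    """Toy authoritative server that rotates A records on every query."""

    def __init__(self, zone):
        # zone: name -> list of IP addresses configured under that name
        self.zone = {name: deque(addrs) for name, addrs in zone.items()}

    def resolve(self, name):
        records = self.zone[name]
        answer = list(records)  # full record set in its current order
        records.rotate(-1)      # the next query sees a different first address
        return answer


dns = ToyDNS({"www.example.com": ["10.0.0.1", "10.0.0.2", "10.0.0.3"]})
```

Real resolvers cache the answer for the record's TTL, which is exactly why a short refresh time is needed for the rotation to reach clients, and why a failed address lingers until the cache expires.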
2. Reverse Proxy Server
Using a proxy server, requests can be forwarded to internal servers, which noticeably improves access speed for static Web pages. The same technique can be used to forward requests evenly to multiple servers, achieving load balancing.

This differs from ordinary proxying: in the standard mode, a client uses the proxy to reach many external servers, whereas here the proxy fronts many clients' access to internal servers, hence the name reverse proxy. The task is not especially complex to implement, but implementing it well is hard because particularly high efficiency is required.

The advantage of a reverse proxy is that it can combine load balancing with the proxy server's caching, giving useful performance gains. It has problems too: first, a dedicated reverse proxy server must be developed for each service, which is no easy task.

Although a proxy server can itself be very efficient, it must maintain two connections per proxied request, one external and one internal, so under particularly high connection loads the proxy server's burden is heavy. In reverse proxy mode an optimized load-balancing strategy can be applied, directing every visit to the most idle internal server; but as concurrent connections grow, the proxy server's own load grows until the reverse proxy itself becomes the service bottleneck.
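A minimal sketch of such a proxy, under assumed backend addresses (the `10.0.0.x` nodes are hypothetical): each incoming GET is forwarded to the next internal Web node in turn, and the two connections per request, one to the client and one to the chosen backend, are visible directly in the handler.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from itertools import cycle
from urllib.request import urlopen

# Assumed internal Web nodes behind the proxy.
BACKENDS = cycle(["http://10.0.0.11:8080", "http://10.0.0.12:8080"])


class ProxyHandler(BaseHTTPRequestHandler):
    """Forward each GET to the next backend (round robin) and relay the reply."""

    def do_GET(self):
        backend = next(BACKENDS)              # pick the internal node
        with urlopen(backend + self.path) as resp:  # internal connection
            body = resp.read()
        self.send_response(resp.status)       # external connection back to client
        self.end_headers()
        self.wfile.write(body)
```

`HTTPServer(("", 8000), ProxyHandler).serve_forever()` would run it; a real deployment would add error handling, header forwarding, caching, and support for methods beyond GET.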

3. Address Translation Gateway
A load-balancing address translation gateway maps one external IP address to multiple internal IP addresses, dynamically choosing an internal address for each TCP connection request, and thus achieves load balancing. Many hardware vendors integrate this technique into their switches as part of their layer-4 switching, typically using random selection, the number of connections per server, or response time as the balancing strategy. Because address translation sits close to the lower layers of the network, it can be integrated into a hardware device, typically a LAN switch.
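The core of the gateway can be sketched as a NAT table (the internal address pool below is invented for the example): each new TCP connection is mapped to one internal server, and the mapping is kept so that later packets of the same connection keep reaching the same node.

```python
import random

# Assumed pool of internal servers behind the single external IP.
INTERNAL_POOL = ["192.168.1.10", "192.168.1.11", "192.168.1.12"]

# NAT table: (client_ip, client_port) -> chosen internal server address.
nat_table = {}


def assign(client_ip, client_port):
    """Pick (or reuse) the internal server for one TCP connection."""
    key = (client_ip, client_port)
    if key not in nat_table:
        # Random strategy; a real device might use least connections
        # or response time instead, as noted above.
        nat_table[key] = random.choice(INTERNAL_POOL)
    return nat_table[key]
```

Entries would be removed when the connection closes; a hardware implementation does the same lookup in the switch's forwarding logic.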
