Selection and Implementation of a Load-Balancing Scheme for Web Cluster Services

Source: Internet
Author: User
A Web cluster is a group of servers that run the same Web application at the same time and appear to the outside world as a single server. To balance the load across the servers and optimize system performance, the cluster distributes incoming requests among its nodes. This gives the system the high reliability and scalability that Web-based enterprise applications require.
  
High reliability can be seen as redundancy built into the system: if the server that receives a request cannot process it, can another server process it effectively? In a highly reliable system, when one Web server fails, another server immediately takes its place and handles the request, and the switch is as transparent as possible so that users do not notice it.
  
Scalability determines whether an application can support a growing number of user requests; it is a capability of the application itself. Scalability is a measure of several factors that affect system performance, including the maximum number of users the cluster can support and the time required to process a request.
  
Among the many existing methods of balancing server load, the following two methods are widely studied and used:
  
DNS load balancing with RR-DNS (round-robin Domain Name System)
Load Balancer
Below, we will discuss the two approaches.
  
DNS round-robin scheduling: RR-DNS (round-robin Domain Name System)
  
A domain name server keeps a data file that maps host names to their IP addresses. When you type a URL in the browser (for example, www.loadbalancedsite.com), the browser sends a request to the DNS server, asking for the IP address of that site; this is called a DNS query. Once the browser has the IP address, it connects to the site at that address and displays the page to the user.
  
A domain name server (DNS) typically maps each site name to a single IP address. In our hypothetical example above, www.loadbalancedsite.com maps to the IP address 203.34.23.3.
  
To balance the server workload through DNS, the DNS server holds several different IP addresses for the same site at the same time. These IP addresses represent the different machines in the cluster, and all of them map logically to the same site name. In our example, www.loadbalancedsite.com is served by three machines in a cluster through the following three IP addresses:
  
203.34.23.3
  
203.34.23.4
  
203.34.23.5
  
In this example, the DNS server contains the following mapping table:
  
www.loadbalancedsite.com 203.34.23.3
  
www.loadbalancedsite.com 203.34.23.4
  
www.loadbalancedsite.com 203.34.23.5
  
When the first request arrives at the DNS server, it returns 203.34.23.3, the IP address of the first machine; for the second request it returns 203.34.23.4, the address of the second machine; and so on. When the fourth request arrives, the address of the first machine is returned again, and the cycle repeats.
  
With DNS round robin, all requests for a particular site are distributed evenly among the machines in the cluster. As a consequence, every node in the cluster is visible to the network.
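
The rotation described above can be sketched in a few lines of Python. This is only a toy model, not how any particular DNS server is implemented: the class name RoundRobinResolver and its resolve method are made up for illustration, and the addresses are the ones from the example mapping table.

```python
from itertools import cycle

# Address records for the example site, in the order the DNS server holds them.
RECORDS = {
    "www.loadbalancedsite.com": ["203.34.23.3", "203.34.23.4", "203.34.23.5"],
}


class RoundRobinResolver:
    """Toy resolver (hypothetical) that hands out a site's addresses in rotation."""

    def __init__(self, records):
        # One endless iterator per host name keeps the rotation state.
        self._rotations = {name: cycle(ips) for name, ips in records.items()}

    def resolve(self, name):
        # Each query returns the next address in the rotation.
        return next(self._rotations[name])


resolver = RoundRobinResolver(RECORDS)
answers = [resolver.resolve("www.loadbalancedsite.com") for _ in range(4)]
print(answers)
```

The fourth answer is the first machine's address again, which is the loop the text describes.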
  
Advantages of DNS round-robin scheduling
  
The biggest advantages of DNS round robin are ease of implementation and low cost:
  
• Low cost and easy to build. To support round-robin scheduling, system administrators only need to make a few changes on the DNS server, and many newer DNS server versions already include this capability. The Web application needs no code changes; in fact, the application is not even aware of the load-balancing configuration in front of it.
• Simple. No network experts are required to set it up or to troubleshoot it when a problem occurs.
Disadvantages of DNS round-robin scheduling
  
This software-based load-balancing method has two main drawbacks: it does not support server affinity (keeping a user's session on a consistent server), and it does not provide high reliability.
  
• Server affinity is not supported. Server affinity is the ability of a load-balancing system to direct a user's requests either to a specific server or to any server, depending on whether session information is kept on the server side or at the underlying database level. DNS round-robin scheduling has no such intelligence. Instead, it relies on one of three methods of identifying the user: cookies, hidden fields, or URL rewriting. Once a user has established a session with a server through one of these text-based tokens, all subsequent requests normally go to the same server. The problem is that the browser caches the server's IP address only temporarily. When the cached record expires, the browser has to look the address up again, so the same user's requests may well be handled by a different server, and all previous session information is lost.
  
• High reliability is not supported. Imagine a cluster with n nodes. If one of the nodes goes down, every request directed to that node goes unanswered, which nobody wants. More advanced routers can check the nodes at regular intervals and remove failed nodes from the list. However, because ISPs and other hosts on the Internet cache large numbers of DNS entries to reduce lookup time, DNS updates propagate slowly: some users may still be sent to nodes that no longer exist, while newly added nodes may receive no traffic. So although DNS round-robin scheduling solves the load-balancing problem to some extent, it does not cope well with changes in the cluster.
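
Both drawbacks come from caching, and a small simulation can make them concrete. The sketch below is an assumption-laden toy: CachingClient, authoritative_lookup, the 300-second lifetime, and the explicit now argument (used instead of a real clock) are all invented for illustration.

```python
class CachingClient:
    """Toy client-side DNS cache (hypothetical): keeps an answer until it expires."""

    def __init__(self, lookup, ttl_seconds):
        self._lookup = lookup      # function: host name -> IP address
        self._ttl = ttl_seconds    # how long a cached answer stays valid
        self._cache = {}           # host name -> (ip, expiry time)

    def resolve(self, name, now):
        cached = self._cache.get(name)
        if cached and now < cached[1]:
            return cached[0]       # still cached: the DNS server is not consulted
        ip = self._lookup(name)
        self._cache[name] = (ip, now + self._ttl)
        return ip


live_nodes = ["203.34.23.3", "203.34.23.4", "203.34.23.5"]
counter = 0


def authoritative_lookup(name):
    """Round robin over the nodes currently listed on the DNS server."""
    global counter
    ip = live_nodes[counter % len(live_nodes)]
    counter += 1
    return ip


client = CachingClient(authoritative_lookup, ttl_seconds=300)
site = "www.loadbalancedsite.com"

first = client.resolve(site, now=0)     # answer is cached for 300 seconds
live_nodes.remove(first)                # that node fails and is taken off the list
stale = client.resolve(site, now=100)   # the cache still points at the dead node
fresh = client.resolve(site, now=400)   # cache expired: a different node is returned
print(first, stale, fresh)
```

While the cache is valid the client keeps hitting the removed node; after it expires the client lands on another node, where its earlier session information would not exist.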
In addition to the round-robin scheduling described above, there are three other ways of allocating load, which gives the following four methods:

• Round robin (RRS): distributes work evenly across the servers (suitable when the real servers have the same performance).

• Least-connections (LCS): assigns more work to servers with fewer active connections; the IPVS table stores all active connections (suitable when the real servers have the same performance).

• Weighted round robin (WRRS): assigns more work to servers with larger capacity. The weights can be adjusted up or down dynamically according to load information (suitable when the real servers differ in performance).

• Weighted least-connections (WLC): assigns more work to servers with fewer connections relative to their capacity. Capacity is expressed by user-specified weights, which can be adjusted up or down dynamically according to load information (suitable when the real servers differ in performance).
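
As a rough illustration of how three of these methods choose a server, consider the following Python sketch. The Server class, the function names, and the sample weights and connection counts are all hypothetical; a production scheduler would typically interleave the weighted round-robin order rather than emit each server in a run as this simple version does.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    weight: int = 1        # relative capacity, specified by the user
    connections: int = 0   # active connections at this moment


def least_connections(servers):
    """LCS: pick the server with the fewest active connections."""
    return min(servers, key=lambda s: s.connections)


def weighted_least_connections(servers):
    """WLC: pick the server with the fewest connections per unit of weight."""
    return min(servers, key=lambda s: s.connections / s.weight)


def weighted_round_robin(servers):
    """WRRS: yield servers endlessly, each one `weight` times per cycle."""
    while True:
        for server in servers:
            for _ in range(server.weight):
                yield server


servers = [
    Server("a", weight=1, connections=2),
    Server("b", weight=3, connections=3),
    Server("c", weight=1, connections=4),
]

lc_choice = least_connections(servers).name
wlc_choice = weighted_least_connections(servers).name
wrr = weighted_round_robin(servers)
wrr_order = [next(wrr).name for _ in range(5)]
print(lc_choice, wlc_choice, wrr_order)
```

With these sample numbers, least-connections picks server a, while the weighted variant prefers b because its three connections are small relative to its weight of 3.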
  
  
  
Load Balancer
  
Through virtual IP addressing, a load balancer solves many of the problems that round-robin scheduling faces. A cluster behind a load balancer looks like a single server with a single IP address. This IP address is virtual and maps to the addresses of the individual machines in the cluster. In effect, the load balancer presents the entire cluster's IP address to the external network.
  
When a request arrives at the load balancer, the load balancer rewrites the request header and forwards the request to one of the machines in the cluster. If a machine is removed from the cluster, requests are not sent to a server that no longer exists, because all machines appear to share the same IP address, and that address does not change when a node is removed. Cached DNS entries on the Internet are therefore no longer a problem. When a response is returned, the client sees it as coming from the load balancer. In other words, the client deals only with the load balancer, and whatever happens behind it is completely transparent to the client.
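
A minimal sketch of this idea, under stated assumptions, is shown below. The VirtualIPBalancer class, the virtual address 203.34.23.10, and the private back-end addresses are hypothetical, and a real load balancer rewrites network or HTTP headers rather than a small Python object.

```python
from dataclasses import dataclass


@dataclass
class Request:
    dest_ip: str   # address the request is currently directed to
    path: str


class VirtualIPBalancer:
    """Toy load balancer (hypothetical) that hides a cluster behind one virtual IP."""

    def __init__(self, virtual_ip, backends):
        self.virtual_ip = virtual_ip
        self.backends = list(backends)
        self._next = 0

    def remove(self, backend_ip):
        # Clients are unaffected: they only ever see the virtual IP.
        self.backends.remove(backend_ip)

    def forward(self, request):
        if request.dest_ip != self.virtual_ip:
            raise ValueError("request was not addressed to the virtual IP")
        backend = self.backends[self._next % len(self.backends)]
        self._next += 1
        # Rewrite the destination so the request reaches a real machine.
        return Request(dest_ip=backend, path=request.path)


lb = VirtualIPBalancer("203.34.23.10", ["10.0.0.1", "10.0.0.2", "10.0.0.3"])
lb.remove("10.0.0.2")   # a node leaves the cluster
targets = [lb.forward(Request("203.34.23.10", "/")).dest_ip for _ in range(3)]
print(targets)
```

Because clients only know the virtual address, removing 10.0.0.2 changes nothing on their side, and no request is forwarded to it afterwards.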
  
Advantages of Load Balancer
  
• Server affinity. The load balancer reads the cookies or URL session information contained in each request sent by the client. Based on this information, it can rewrite the header and send the request to the node in the cluster that holds that client's session information. Load balancers provide server affinity for HTTP traffic, but not for secure traffic such as HTTPS: when the message is encrypted with SSL, the load balancer cannot read the session information inside it.
  
• High reliability through failover. Failover occurs when a node in a cluster cannot process a request and the request is redirected to another node. There are two main types of failover:
  
• Request-level failover. When a node in a cluster cannot process a request (usually because the machine is down), the request is sent to another node. Any session information saved on the original node is lost when the request is redirected.
  
• Transparent session failover. When an invocation fails, the load balancer routes it to another node in the cluster to complete the operation, transparently to the user. Because the new node needs the session's state in order to take over, all nodes in the cluster must share a common storage area or database for session data, so that each node can recover a failed session on its own.
  
• Statistical measurement. Because all Web application requests must pass through the load balancer, the system can determine the number of active sessions, the number of active sessions connected to each instance, response times, peak load times, the number of sessions during peak and trough periods, and more. These statistics can be used to tune the performance of the entire system.
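
The three advantages above can be combined in one small sketch. Everything here is an assumption made for illustration: the StickyBalancer class, the node names, the session ID, and the plain dictionary standing in for a shared session database.

```python
class StickyBalancer:
    """Toy balancer (hypothetical): server affinity, failover, and a simple metric."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = set(nodes)
        self.affinity = {}                               # session id -> node
        self.requests_per_node = {n: 0 for n in nodes}   # statistical measurement

    def mark_down(self, node):
        self.healthy.discard(node)

    def route(self, session_id):
        node = self.affinity.get(session_id)
        if node not in self.healthy:
            # New session, or its node failed: fail over to the least-used healthy node.
            candidates = [n for n in self.nodes if n in self.healthy]
            if not candidates:
                raise RuntimeError("no healthy nodes left in the cluster")
            node = min(candidates, key=lambda n: self.requests_per_node[n])
            self.affinity[session_id] = node
        self.requests_per_node[node] += 1
        return node


# Session data kept in storage shared by all nodes, so it survives a failover.
shared_sessions = {"sess-42": {"cart": ["book"]}}

lb = StickyBalancer(["node1", "node2", "node3"])
first = lb.route("sess-42")
second = lb.route("sess-42")    # affinity: same node as the first request
lb.mark_down("node1")
third = lb.route("sess-42")     # failover: redirected to another node
cart_after_failover = shared_sessions["sess-42"]["cart"]
print(first, second, third, lb.requests_per_node)
```

If the session data lived only on node1, the third request would find nothing on its new node, which is the request-level case; the shared store is what makes the failover transparent.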
  
Disadvantages of Load Balancer
  
The disadvantages of hardware load balancing are cost, complexity, and a single point of failure. Because all requests pass through a single hardware load balancer, any failure of the load balancer brings down the entire site.
  
Load balancing for HTTPS requests
  
As mentioned above, load balancing and maintaining session information are difficult for HTTPS requests, because the information in these requests is encrypted and the load balancer cannot read it. There are, however, two ways to solve this problem:
  
Web server proxy
Hardware SSL decoder
A Web server proxy sits in front of the cluster, accepts all requests, and decrypts them. It then forwards each decrypted request to the appropriate node based on the header information. This approach requires no special hardware, but it places an additional burden on the proxy server.
  
A hardware SSL decoder decrypts requests before they reach the load balancer. This approach is faster than a proxy server, but it costs more and is more complex to implement.
