Load Balancing Architecture for Large Websites

Source: Internet
Author: User

Excerpt from: http://www.cnblogs.com/and/p/3366400.html

Load balancing is built on top of the existing network structure. It provides a cheap, effective, and transparent way to extend the bandwidth of network devices and servers, increase throughput, strengthen network data processing capacity, and improve network flexibility and availability.

Powerful tools for load balancing large websites
    • Global load balancing system (GSLB)
    • Content cache system (CDN)
    • Server load balancing system (SLB)

The basic process of DNS domain name resolution

Initial load balancing solution (DNS round robin)

Advantages

    • Essentially free, because domain name registrars usually provide this kind of resolution at no charge;
    • Easy to deploy: apart from a simple extension of the network topology, a new Web server only needs a public IP address.

Disadvantages

    • No health checks: if a server goes down, the DNS server does not know, and requests are still directed to that server. Because of caching, a change to the DNS records may take 3-4 hours or longer to take effect;
    • Uneven distribution: if the Web servers have different configurations and can handle different loads, DNS resolution still distributes requests evenly among them. In addition, an uneven distribution of user groups leads to unbalanced DNS resolution.
    • No session persistence: for a Web site that requires authentication, this is fatal unless the software architecture is changed, because DNS resolution cannot guarantee that an authenticated user is always sent to the same server. The local DNS cache helps to some extent, but it is hard to ensure that the cache does not expire during a user's visit, which triggers a new query that may point to a different server. The user information saved on the original server is not carried over to the new server, so the user may have to authenticate again. If users keep switching back and forth, every server ends up holding its own copy of their information, which also wastes server resources.
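
The missing health check can be illustrated with a small simulation. This is only a sketch: it does not query a real DNS server, and the addresses and the `make_resolver` helper are hypothetical names chosen for the example.

```python
from itertools import cycle

# Hypothetical A records for one hostname (documentation-range addresses).
A_RECORDS = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]

def make_resolver(records):
    """Return a function that hands out the records in strict rotation."""
    rotation = cycle(records)
    return lambda: next(rotation)

resolve = make_resolver(A_RECORDS)
down = {"192.0.2.11"}  # this server has failed, but DNS does not know it

# Six consecutive lookups: the rotation ignores server health entirely.
answers = [resolve() for _ in range(6)]
failed = [ip for ip in answers if ip in down]
print(answers)
print(f"{len(failed)} of {len(answers)} lookups were sent to a dead server")
```

In this model one third of all lookups keep landing on the failed server until the record is removed and the caches expire.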

Global load balancing system (GSLB)

Advantages

    • Data center redundancy and backup
    • Multi-site traffic optimization
    • A consistent user experience

The principle of the global load balancing system (GSLB)

Many DNS check tools are available online; you can search for them.

Content cache system (CDN)
    • Static acceleration of the content cache system (CDN)
    • Dynamic acceleration of the content cache system (CDN)

Features of dynamic acceleration

    • Smart routing
    • Transmission Control Protocol (TCP) optimizations
    • HTTP preloading

Server load balancing system

Application background

    • Rapid growth in access traffic
    • Business volume continues to increase

User requirements

    • Users expect uninterrupted 7x24 availability and faster system response times

Load balancing must meet requirements for performance, scalability, and reliability.

Three access modes of the server load balancing system

Inline route mode

Characteristics

  • A fairly common deployment method

Advantages

  • The load balancer effectively isolates the servers, which gives the best security
  • The servers' gateway points to the load balancer, so features are easier to implement and load balancing performance can be maximized
  • The servers receive the real source IP address of the client directly

Disadvantages

  • Requires large changes to the existing topology
  • You need to consider whether the intranet servers must access external networks; if so, static NAT translation has to be set up

Single-arm mode

Characteristics

  • The most common deployment method

Advantages

  • Easy to deploy, with small changes to the existing topology
  • Traffic unrelated to the application is not affected by the load balancer

Disadvantages

  • Internal applications are not affected, but external applications typically require the front-end firewall to NAT to the application VIP
  • The servers cannot receive the client source address directly; the application has to be changed to obtain the real client address by other means

DSR (Direct Server Return)

Characteristics

  • Server response packets are returned directly to the client instead of passing back through the load balancer
  • Low latency, suitable for streaming media and other latency-sensitive applications

Advantages

  • High performance and high throughput
  • The servers receive the real source IP address of the client directly

Disadvantages

  • Only Layer 4 load balancing is possible; Layer 7 optimizations (such as compression) cannot be used
  • A loopback address has to be configured on the servers
Common scheduling algorithms of the server load balancing system
    • Round robin
    • Weighted round robin
    • Least connections
    • Weighted least connections
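
The four algorithms can be sketched in a few lines. This is a simplified illustration, not how any particular load balancer implements them: the server names, weights, and connection counts are made up, and the weighted round robin simply repeats each server according to its weight.

```python
from itertools import cycle

def round_robin(servers):
    """Hand out servers one after another, forever."""
    return cycle(servers)

def weighted_round_robin(weights):
    """weights maps server -> positive integer weight.
    A server with weight 3 appears three times per rotation."""
    expanded = [s for s, w in weights.items() for _ in range(w)]
    return cycle(expanded)

def least_connections(active):
    """active maps server -> current connection count."""
    return min(active, key=active.get)

def weighted_least_connections(active, weights):
    """Pick the server with the fewest connections per unit of weight."""
    return min(active, key=lambda s: active[s] / weights[s])

# Hypothetical pool: web1 is three times as powerful as web2.
weights = {"web1": 3, "web2": 1}
active = {"web1": 12, "web2": 5}

wrr = weighted_round_robin(weights)
picks = [next(wrr) for _ in range(8)]
print(picks.count("web1"), picks.count("web2"))     # 6 2
print(least_connections(active))                    # web2 (5 < 12)
print(weighted_least_connections(active, weights))  # web1 (12/3 < 5/1)
```

The last two lines show why the weighted variant matters: web2 has fewer connections in absolute terms, but relative to its capacity web1 is the less loaded server.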

Health check

The purpose of a health check is to probe the real servers in the server farm through some kind of probe mechanism and to avoid distributing client requests to failed servers, which improves the high availability of the business.

Common health check methods today:

    • Ping (ICMP)
    • TCP
    • HTTP
    • FTP
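
As an illustration of the TCP method, the sketch below treats a server as healthy when a TCP connection to it can be opened within a timeout. The function names and the pool addresses are assumptions for the example; real devices also add retry counts and probe intervals, which are omitted here.

```python
import socket

def tcp_check(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the server as down.
        return False

def healthy_servers(servers, check=tcp_check):
    """Keep only the (host, port) pairs that pass the health check."""
    return [(host, port) for host, port in servers if check(host, port)]

# Usage with a stand-in check so the example runs without a network.
pool = [("10.0.0.11", 80), ("10.0.0.12", 80)]
fake_check = lambda host, port: host != "10.0.0.12"
print(healthy_servers(pool, check=fake_check))  # [('10.0.0.11', 80)]
```

The scheduler would then choose only among the servers returned by `healthy_servers`, so a failed server stops receiving requests after the next probe.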

System acceleration

Optimization function: SSL acceleration

Optimization function: HTTP compression

HTTP compression transfers text content in compressed form between a Web server and a browser. F5 HTTP compression technology uses the intelligent compression capability of BIG-IP systems to reduce application delivery time and optimize bandwidth. HTTP compression applies a common compression algorithm to HTML, JavaScript, or CSS files. Its greatest benefit is that less data is transmitted over the network, which speeds up access for the client browser.
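
The effect can be reproduced with Python's standard `gzip` module. This is a generic sketch, not F5's implementation: the `compress_response` helper and the list of compressible types are assumptions, and the `Accept-Encoding` parsing ignores quality values such as `gzip;q=0`.

```python
import gzip

# Content types worth compressing in this sketch (text-based formats).
COMPRESSIBLE = ("text/html", "text/css", "application/javascript")

def compress_response(body, content_type, accept_encoding):
    """Return (body, extra_headers), gzip-compressing when appropriate."""
    tokens = [t.split(";")[0].strip() for t in accept_encoding.split(",")]
    if "gzip" in tokens and content_type in COMPRESSIBLE:
        headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        return gzip.compress(body), headers
    return body, {}

# A repetitive HTML page compresses very well.
html = ("<html><body>" + "<p>Hello, load balancing!</p>" * 200
        + "</body></html>").encode("utf-8")
body, headers = compress_response(html, "text/html", "gzip, deflate")
print(len(html), "bytes before,", len(body), "bytes after", headers)
```

Already compressed formats such as images are passed through unchanged, because compressing them again costs CPU time without saving bandwidth.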

Optimization function: connection multiplexing

Optimization function: TCP cache

Session persistence

Session persistence: client source IP

Source IP session persistence treats connections or requests from the same source IP address as coming from the same user. According to the persistence policy, for as long as the session entry remains valid, connections and requests from that source IP address are forwarded to the same server.
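
A persistence table of this kind can be sketched as follows. The class name, the 300-second default, and the round-robin choice for new clients are assumptions made for the example; the clock is injectable only so that expiry can be demonstrated without waiting.

```python
import time

class SourceIpPersistence:
    """Map each client IP to one server until the entry times out."""

    def __init__(self, servers, timeout=300.0, clock=time.monotonic):
        self.servers = list(servers)
        self.timeout = timeout
        self.clock = clock
        self.table = {}   # client_ip -> (server, last_seen)
        self._next = 0    # round-robin position for new clients

    def _pick_new(self):
        server = self.servers[self._next % len(self.servers)]
        self._next += 1
        return server

    def route(self, client_ip):
        now = self.clock()
        entry = self.table.get(client_ip)
        if entry is not None and now - entry[1] < self.timeout:
            server = entry[0]          # entry still valid: stick to it
        else:
            server = self._pick_new()  # new or expired: choose again
        self.table[client_ip] = (server, now)  # refresh the timer
        return server

lb = SourceIpPersistence(["web1", "web2", "web3"])
print(lb.route("198.51.100.7"), lb.route("198.51.100.8"), lb.route("198.51.100.7"))
# web1 web2 web1
```

Every request costs one table lookup and one update, and the table grows with the number of distinct client addresses.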

Session persistence: cookie

Source-address-based persistence can lead to uneven load sharing, for example when the source IP addresses of the clients initiating connections are relatively fixed. Such problems can usually be solved with application-layer session persistence. Cookies are carried in the HTTP header, and HTTP-based applications are now very widespread, so cookie-based session persistence appears more and more often in server load balancing solutions.

Limitations:

It does not work for non-HTTP protocols or when the client has disabled cookies.
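
One way to picture cookie persistence is a load balancer that inserts its own cookie naming the chosen server. The sketch below assumes that approach; the cookie name `LB_SERVER` and the `route_by_cookie` helper are hypothetical, and real devices usually encode or encrypt the server identity instead of sending it in clear text.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "LB_SERVER"  # hypothetical cookie name for this sketch

def route_by_cookie(cookie_header, servers, pick_new):
    """Return (server, set_cookie_value or None).

    cookie_header is the raw Cookie request header (may be empty);
    pick_new is called to choose a server for clients without a cookie."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(COOKIE_NAME)
    if morsel is not None and morsel.value in servers:
        return morsel.value, None  # returning client: keep its server
    server = pick_new()
    out = SimpleCookie()
    out[COOKIE_NAME] = server
    out[COOKIE_NAME]["path"] = "/"
    return server, out[COOKIE_NAME].OutputString()

servers = ["web1", "web2"]
first = route_by_cookie("", servers, lambda: "web1")
again = route_by_cookie("LB_SERVER=web2; theme=dark", servers, lambda: "web1")
print(first)
print(again)
```

A client that never sends the cookie back is treated as new on every request, which is exactly the limitation noted above.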

Session persistence: URL hash

The basic idea of hash-based session persistence is to choose the server for a request by computing a hash over some factor and combining the result with the number of back-end servers. As long as the health state of the back-end servers does not change, a given hash factor is always assigned to the same server. Its biggest advantage is that no persistence table is needed: the server is determined purely by calculation. When looking up a persistence table would cost much more than computing a hash, hash-based persistence improves the processing capacity and response speed of the system.

URL hash persistence is typically used when the back end consists of cache servers. The hash is computed over the URL, so requests for the same URL are always assigned to the same cache server. Each cache server in the farm therefore stores different content, which increases the utilization of the cache servers.
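
The calculation described above can be sketched with a plain modulo over the server list. SHA-256 is used here only as a convenient stable hash; the source does not say which hash function real devices use, and the cache names are made up.

```python
import hashlib

def pick_by_url_hash(url, servers):
    """Map a URL to one server using hash(url) mod number_of_servers."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

caches = ["cache1", "cache2", "cache3"]
for url in ["/img/logo.png", "/css/site.css", "/img/logo.png"]:
    # The same URL always lands on the same cache server.
    print(url, "->", pick_by_url_hash(url, caches))
```

No table is consulted at any point. The price of this simplicity is that when the number of healthy servers changes, the modulo changes too and most URLs are remapped to a different cache.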

Failure case analysis

Q&A case study (1): redirect loop

Failure phenomena:

The Web server inspects the URL the user accesses and redirects non-HTTPS requests to the HTTPS site. As a result, the user is caught in an endless series of 302 redirects.

Cause analysis:

With the SSL acceleration feature of the load balancer enabled, the server sees every user request arriving over HTTP.

Solution:

Enable SSL acceleration for the entire site.

Q&A case study (2): user session lost

Failure phenomena:

On the HTTP site, the user submits data to the HTTPS site of the same domain. The Web program throws a missing-session exception and the submission fails.

Cause analysis:

The load balancer treats HTTP and HTTPS as two separate services, so the two separate TCP connections hit different real servers and the session is lost.

Solution:

Enable real-server-based session persistence on the load balancer device.

Q&A case study (3): client source IP cannot be obtained

Failure phenomena:

The server cannot obtain the user's public IP address and instead sees a large number of requests from a specific intranet segment.

Cause analysis:

The load balancer has source address translation (SNAT) enabled, which rewrites the user's source IP in the TCP packets.

Solution:

The load balancer overwrites the X-Forwarded-For value with the user's public IP, and the server reads the X-Forwarded-For header of the HTTP request as the user's source IP. IIS logs can display the user's source IP after a plug-in is installed.
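
On the server side, reading the header could look like the sketch below. It is an illustration under stated assumptions: `headers` is a plain dict keyed by the exact header name, `lb_networks` is a hypothetical list of the SNAT address ranges of the load balancer, and the header is trusted only when the TCP peer is inside one of those ranges.

```python
import ipaddress

def client_ip(headers, peer_ip, lb_networks):
    """Return the real client IP as a string.

    peer_ip is the source address of the TCP connection. The
    X-Forwarded-For value is used only if the peer is the load balancer."""
    peer = ipaddress.ip_address(peer_ip)
    from_lb = any(peer in net for net in lb_networks)
    value = headers.get("X-Forwarded-For", "").strip()
    if from_lb and value:
        # If several addresses are listed, the last one was added by the
        # nearest proxy, which here is the trusted load balancer.
        candidate = value.split(",")[-1].strip()
        try:
            return str(ipaddress.ip_address(candidate))
        except ValueError:
            pass  # malformed header: fall back to the peer address
    return peer_ip

snat_ranges = [ipaddress.ip_network("10.0.0.0/24")]
print(client_ip({"X-Forwarded-For": "203.0.113.7"}, "10.0.0.5", snat_ranges))
# 203.0.113.7
```

Ignoring the header on connections that do not come from the load balancer prevents clients from forging their own source address.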

Server load balancing equipment selection

1. Price factors
Hardware devices: F5, Citrix, Radware, A10
Software: LVS, Nginx, HAProxy, Zen Load Balancer

2. Performance
Layer 4/7 throughput (unit: bps)
Layer 4/7 new connections per second (unit: CPS)
Number of concurrent connections
Performance metrics of function modules (SSL acceleration, HTTP compression, memory cache)

3. Meeting current and future needs
1) If you are sure that the load balancer handles all applications with only the simplest Layer 4 processing, then in theory the Layer 4 performance of the selected device only needs to be slightly higher than the actual performance requirement.
2) If you are sure that the load balancer handles all applications with simple Layer 7 processing, then in theory the Layer 7 performance of the selected device only needs to be slightly higher than the actual performance requirement.
3) If the applications handled by the load balancer range from Layer 4 to Layer 7, it is recommended to size the device according to the Layer 7 performance requirement.
4) If your applications require complex Layer 4 or Layer 7 processing, such as policy-based distribution by client address, processing based on TCP content, or processing based on the HTTP header or HTTP body, the recommended Layer 4/7 performance of the load balancer is twice the actual requirement.
5) If the load balancer handles mixed, complex traffic and some function modules are also enabled, the recommended Layer 4/7 performance is 3 times the actual requirement.
6) Because a device runs more stably under light load, another 30% of performance can be added on top of the above.
7) To meet the development needs of the next few years, additional performance should be reserved on top of the above for future growth.
8) Load balancers from different vendors use different architectures, so some devices may perform better in complex environments, and customers can compare and judge for themselves. Overall, however, the recommendations above apply to devices from all vendors.
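
The multipliers in recommendations 4) to 6) can be turned into a small calculation. This is a rough sketch: the function and the complexity labels are invented for the example, the factor 1.0 for simple processing is a simplification of "slightly higher than the actual requirement", and growth reserve from recommendation 7) is left out because the source gives no figure for it.

```python
# Multipliers taken from recommendations 4) and 5); 1.0 is a simplification.
FACTORS = {
    "simple": 1.0,              # plain Layer 4 or simple Layer 7 processing
    "complex": 2.0,             # policy, TCP content, or HTTP header processing
    "mixed_with_modules": 3.0,  # mixed complex traffic plus function modules
}

def required_capacity(measured, complexity="simple", light_load_margin=True):
    """Scale a measured requirement (any unit, e.g. Gbps or CPS)."""
    capacity = measured * FACTORS[complexity]
    if light_load_margin:
        capacity *= 1.3  # recommendation 6): add 30% for light-load operation
    return capacity

# Example: 2 Gbps of measured Layer 7 traffic with complex processing.
print(required_capacity(2.0, "complex"))
```

For the example values this gives 2 x 2 x 1.3 = 5.2 Gbps, before any reserve for future growth is added.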
