"Classic must-read" The evolution of web site architecture: leveling up an e-commerce site step by step

Source: Internet
Author: User
Tags: failover, http redirect

Preface

We take Java web development as an example and build a simple e-commerce system to see how such a system can evolve step by step.

The functions of the system:

User module: user registration and management

Product module: product display and management

Transaction module: transaction creation and management

Phase I, building the web site on a single machine

When a web site first starts out, we usually run all of our programs and software on a single machine. We use a servlet container such as Tomcat, Jetty, or JBoss, and either use JSP/Servlet technology directly or use open source frameworks such as Maven + Spring + Struts + Hibernate or Maven + Spring + Spring MVC + MyBatis. Finally, we choose a database management system such as MySQL, SQL Server, or Oracle to store the data, and connect to and operate on the database through JDBC.


All of the software above is installed on the same machine, and once the application is running we have a small system. At this point the system structure is as follows:

Phase II, Application server and database separation

Once the site goes online, the number of visits gradually increases and the server load slowly rises. Before the server becomes overloaded, we should prepare to increase the load capacity of the site. If the code is already difficult to optimize further, then, short of upgrading the single machine, adding machines is a good approach: it effectively raises the load capacity of the system and is also cost-effective.


What should the added machine be used for? At this point we can separate the database from the web server, which improves not only the load capacity of each machine but also the disaster tolerance of the system.


The architecture after separating the application server from the database is shown below:

Phase III, Application server cluster

As the number of visits continues to grow, a single application server can no longer meet the demand. Assuming that the database server is not yet under pressure, we can go from one application server to two or more and spread user requests across them, thereby increasing the load capacity. The application servers do not interact with each other directly; each relies on the database to provide its service.


A well-known piece of failover software is Keepalived. Keepalived is system software that works much like a layer 3, 4, and 7 switch. It is not tied to failover for one particular product but can be applied to many kinds of software. Together with ipvsadm, Keepalived can also do load balancing, which makes it a very handy tool.


Taking the addition of one application server as an example, the resulting system structure is as follows:

Once the system has evolved to this point, the following four questions arise:

Who forwards the user's request to a specific application server?

What is the forwarding algorithm?

How does the application server return the response to the user?

If a user does not reach the same server on every visit, how is session consistency maintained?

Let's look at the solutions to these problems:

1, The first problem is load balancing. There are generally 5 kinds of solutions:

HTTP redirection. HTTP redirection forwards requests at the application layer. The user's request first arrives at the HTTP redirection load balancing server, which picks a server according to its algorithm and replies with a redirect. After receiving the redirect, the user sends the request again, this time to the real cluster.

Advantages: Simple.

Disadvantage: poor performance.
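The redirect step can be sketched in a few lines of Java. This is only an illustration, not a production balancer: the class name RedirectBalancer, the method locationFor, and the backend addresses are made up for the example, round-robin is an assumed choice of algorithm, and Java 11 or later is assumed. The method returns the value the balancer would put in the Location header of a 302 response.

```java
import java.net.URI;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of an HTTP redirection balancer: it only decides where to send the client. */
final class RedirectBalancer {
    private final List<URI> backends;                 // hypothetical application servers
    private final AtomicInteger next = new AtomicInteger();

    RedirectBalancer(List<URI> backends) {
        if (backends.isEmpty()) throw new IllegalArgumentException("no backends");
        this.backends = List.copyOf(backends);
    }

    /** Returns the Location header value for a 302 response to the given path and query. */
    URI locationFor(String pathAndQuery) {
        // Assumed algorithm: plain round-robin over the backends
        int i = Math.floorMod(next.getAndIncrement(), backends.size());
        return backends.get(i).resolve(pathAndQuery);
    }
}
```

The client then has to issue a second request to the returned address, which is where the performance cost of this approach comes from.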

DNS load balancing. When the user asks the DNS server for the IP address of the domain name, the DNS server directly returns the IP address of a server chosen by load balancing.

Advantage: The work is handed to DNS, so we do not need to maintain a load balancing server ourselves.

Disadvantage: When an application server goes down, DNS cannot be notified in time. Also, control over DNS load balancing lies with the domain name service provider, so the web site cannot improve it further or manage it more closely.

Reverse proxy server. When the user's request reaches the reverse proxy server (by then it has already reached the site's data center), the reverse proxy forwards it to a specific server according to its algorithm. The commonly used Apache and nginx can both act as reverse proxy servers.

Advantages: Simple deployment.

Disadvantage: The proxy server can become a performance bottleneck, especially when large files are uploaded.

IP layer load balancing. After the request reaches the load balancer, the load balancer forwards it by modifying the destination IP address of the request, thereby balancing the load.

Advantages: Better performance.

Disadvantage: The bandwidth of the load balancer becomes the bottleneck.

Data link layer load balancing. After the request reaches the load balancer, the load balancer balances the load by modifying the destination MAC address of the request. Unlike IP layer load balancing, the server returns the response directly to the client once it has handled the request, without going back through the load balancer.

2, The second problem is the cluster scheduling algorithm. There are 10 common scheduling algorithms.

RR, round-robin scheduling. As the name suggests, requests are distributed to the servers in turn.

Advantage: simple to implement.

Disadvantage: does not consider the processing capacity of each server.
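A minimal Java sketch of round-robin selection follows. The class name RoundRobinBalancer and the server names are made up for the example, and Java 11 or later is assumed.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Round-robin (RR) selection: hand out servers in turn, ignoring their capacity. */
final class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<String> servers) {
        if (servers.isEmpty()) throw new IllegalArgumentException("no servers");
        this.servers = List.copyOf(servers);
    }

    String pick() {
        // floorMod keeps the index non-negative even after the counter overflows
        int i = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(i);
    }
}
```

With two servers, successive calls to pick() alternate between them regardless of how busy each one is.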

WRR, weighted round-robin scheduling. We set a weight for each server, and the load balancing scheduler dispatches requests according to the weights, so that the number of times a server is chosen is proportional to its weight.

Advantage: takes the different processing capacities of the servers into account.
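One simple way to get this proportionality, sketched below, is to give each server as many slots in the rotation as its weight. This is an illustrative approach, not necessarily how any particular balancer implements WRR; the class name WeightedRoundRobinBalancer is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Weighted round-robin sketch: a server with weight w occupies w slots in the rotation. */
final class WeightedRoundRobinBalancer {
    private final List<String> slots = new ArrayList<>();
    private int next = 0;

    WeightedRoundRobinBalancer(Map<String, Integer> weights) {
        for (Map.Entry<String, Integer> e : weights.entrySet()) {
            for (int i = 0; i < e.getValue(); i++) {
                slots.add(e.getKey());
            }
        }
        if (slots.isEmpty()) throw new IllegalArgumentException("no server with a positive weight");
    }

    synchronized String pick() {
        String server = slots.get(next);
        next = (next + 1) % slots.size();
        return server;
    }
}
```

With weights app1=3 and app2=1, every four picks contain app1 three times and app2 once. Pass a LinkedHashMap if the order of the rotation should follow the insertion order of the servers.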

SH, source address hashing. Take the user's IP address, compute a key from it with a hash function, then look up the corresponding value in a static mapping table; that value is the IP address of the target server. If the target server is overloaded, return empty.

DH, destination address hashing. Same as above, except that the hash is computed from the destination IP address.

Advantage: both of these algorithms send the same user to the same server every time.
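A small Java sketch of source address hashing is shown below. The class name SourceHashBalancer is made up, String.hashCode stands in for whatever hash function a real balancer uses, and the set of overloaded servers is passed in by the caller as an assumption of the example. Destination hashing would look the same with the destination IP as the key.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Source-hash (SH) sketch: the same client IP always maps to the same server. */
final class SourceHashBalancer {
    private final List<String> servers;   // acts as the static mapping table

    SourceHashBalancer(List<String> servers) {
        if (servers.isEmpty()) throw new IllegalArgumentException("no servers");
        this.servers = List.copyOf(servers);
    }

    /** Returns the mapped server, or empty if that server is currently overloaded. */
    Optional<String> pick(String clientIp, Set<String> overloaded) {
        // Assumption: String.hashCode as the hash function, modulo the table size
        String target = servers.get(Math.floorMod(clientIp.hashCode(), servers.size()));
        return overloaded.contains(target) ? Optional.empty() : Optional.of(target);
    }
}
```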

LC, least connections. Requests are sent preferentially to the server with the fewest connections.

Advantage: makes the load on the servers in the cluster more even.
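The following Java sketch shows the idea under simple assumptions: the balancer itself counts open connections, and the caller reports when a connection ends. The class and method names (LeastConnectionsBalancer, acquire, release) are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Least-connections (LC) sketch: pick the server with the fewest open connections. */
final class LeastConnectionsBalancer {
    private final Map<String, Integer> active = new LinkedHashMap<>();  // server -> open connections

    LeastConnectionsBalancer(List<String> servers) {
        if (servers.isEmpty()) throw new IllegalArgumentException("no servers");
        for (String s : servers) active.put(s, 0);
    }

    /** Chooses a server and counts the new connection against it. */
    synchronized String acquire() {
        String best = null;
        for (Map.Entry<String, Integer> e : active.entrySet()) {
            if (best == null || e.getValue() < active.get(best)) best = e.getKey();
        }
        active.merge(best, 1, Integer::sum);
        return best;
    }

    /** Called when a connection to the given server is closed. */
    synchronized void release(String server) {
        active.computeIfPresent(server, (k, v) -> Math.max(0, v - 1));
    }
}
```

Ties are broken in favor of the server listed first.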

WLC, weighted least connections. On top of LC, a weight is added for each server. The formula is (active connections × 256 + inactive connections) ÷ weight, and the server with the smallest value is chosen first.

Advantage: requests can be allocated according to the capacity of each server.

SED, shortest expected delay. SED is similar to WLC, except that inactive connections are not considered. The formula is (active connections + 1) × 256 ÷ weight, and again the server with the smallest value is chosen first.
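The two formulas can be compared side by side in a short Java sketch. The class WeightedOverhead and its Server type are made up for the example; only the two formulas come from the text above.

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** WLC and SED sketch: compute each server's overhead and pick the smallest. */
final class WeightedOverhead {
    static final class Server {
        final String name;
        final int active;    // active connections
        final int inactive;  // inactive connections
        final int weight;

        Server(String name, int active, int inactive, int weight) {
            this.name = name;
            this.active = active;
            this.inactive = inactive;
            this.weight = weight;
        }
    }

    /** WLC: (active connections * 256 + inactive connections) / weight. */
    static double wlc(Server s) {
        return (s.active * 256.0 + s.inactive) / s.weight;
    }

    /** SED: (active connections + 1) * 256 / weight. */
    static double sed(Server s) {
        return (s.active + 1) * 256.0 / s.weight;
    }

    static Server pick(List<Server> servers, ToDoubleFunction<Server> overhead) {
        return servers.stream()
                .min(Comparator.comparingDouble(overhead))
                .orElseThrow(() -> new IllegalArgumentException("no servers"));
    }
}
```

For an idle server a with weight 1 and a server b with weight 4 and one active connection, WLC gives 0 for a and 64 for b and so picks a, while SED gives 256 for a and 128 for b and so picks b: SED prefers the more capable server even though it is already busy.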

NQ, never queue. An improvement on SED. Consider when a request can "never queue": when a server has 0 connections. So if some server has 0 connections, the balancer forwards the request to it directly, without going through the SED calculation.
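As a rough Java sketch, NQ is SED with a shortcut for idle servers. The class NeverQueue and its Server type are made up for the example.

```java
import java.util.List;

/** Never-queue (NQ) sketch: an idle server wins immediately, otherwise fall back to SED. */
final class NeverQueue {
    static final class Server {
        final String name;
        final int active;   // active connections
        final int weight;

        Server(String name, int active, int weight) {
            this.name = name;
            this.active = active;
            this.weight = weight;
        }
    }

    /** SED overhead: (active connections + 1) * 256 / weight. */
    static double sed(Server s) {
        return (s.active + 1) * 256.0 / s.weight;
    }

    static Server pick(List<Server> servers) {
        Server best = null;
        for (Server s : servers) {
            if (s.active == 0) return s;                       // idle: no need to compute SED
            if (best == null || sed(s) < sed(best)) best = s;  // otherwise track the SED minimum
        }
        if (best == null) throw new IllegalArgumentException("no servers");
        return best;
    }
}
```

Given a busy server with weight 8 and three active connections (SED value 128) and an idle server with weight 1 (SED value 256), SED alone would choose the busy server, while NQ sends the request to the idle one.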

LBLC, locality-based least connections. Based on the destination IP address of the request, the balancer finds the server most recently used for that IP address and forwards the request to it. If that server is overloaded, the least connections algorithm is used instead.
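A simplified Java sketch of this behavior follows. The overload rule here (a fixed connection threshold) is an assumption made for the example and is not taken from any real implementation; the class name LocalityBalancer is also made up.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** LBLC sketch: keep sending a destination IP to its last server unless that server is overloaded. */
final class LocalityBalancer {
    private final Map<String, Integer> active = new LinkedHashMap<>(); // server -> open connections
    private final Map<String, String> lastServer = new HashMap<>();    // destination IP -> server
    private final int overloadThreshold;                               // assumed overload rule

    LocalityBalancer(List<String> servers, int overloadThreshold) {
        if (servers.isEmpty()) throw new IllegalArgumentException("no servers");
        for (String s : servers) active.put(s, 0);
        this.overloadThreshold = overloadThreshold;
    }

    synchronized String pick(String destinationIp) {
        String server = lastServer.get(destinationIp);
        if (server == null || active.get(server) >= overloadThreshold) {
            server = leastConnected();               // fall back to least connections
            lastServer.put(destinationIp, server);   // and remember the new choice
        }
        active.merge(server, 1, Integer::sum);
        return server;
    }

    synchronized void release(String server) {
        active.computeIfPresent(server, (k, v) -> Math.max(0, v - 1));
    }

    private String leastConnected() {
        String best = null;
        for (Map.Entry<String, Integer> e : active.entrySet()) {
            if (best == null || e.getValue() < active.get(best)) best = e.getKey();
        }
        return best;
    }
}
```

With a threshold of 2, the first two requests for the same destination go to the same server, and the third is moved to the least-loaded server, which then becomes the remembered server for that destination.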

LBLCR, locality-based least connections with replication. Based on the destination IP address of the request, the balancer finds the "server group" most recently used for that IP address (note that this is a group, not one specific server), then uses least connections to pick a server from the group and forwards the request to it. If that server is overloaded, the balancer uses least connections to find a server in the cluster that is not in the group, adds it to the group, and forwards the request to it.

3, The third problem is the cluster mode. There are generally 3 kinds of solutions:

NAT: The load balancer receives the user's request and forwards it to a specific server. The server handles the request and returns the response to the balancer, which then returns it to the user.

DR: The load balancer receives the user's request and forwards it to a specific server, and the server returns the response directly to the user after handling the request. No IP tunneling protocol is needed, so cross-platform support is good and most systems can support it.

TUN: Same as above, but the request is forwarded through an IP tunnel, so the servers must support the IP tunneling protocol, which makes it hard to use across platforms.

4, The fourth problem is the session. There are generally 4 kinds of solutions:

Session sticky. Session sticky assigns all requests in one user's session to a fixed server, so we do not need to solve the problem of sharing sessions across servers. A common algorithm is ip_hash, that is, the two hashing algorithms mentioned above.

Advantages: Simple to implement.

Disadvantage: The session is lost when the application server restarts.

Session replication. Session replication copies sessions across the cluster so that every server holds the session data of all users.

Advantage: reduces the pressure on the load balancing server, which no longer needs an ip_hash algorithm to forward requests.

Disadvantage: replication costs bandwidth, and when traffic is heavy the sessions take up a lot of memory, much of it wasted.

Centralized session data storage. Session data is stored in a database, which decouples sessions from the application servers.

Advantage: compared with session replication, the pressure on bandwidth and memory within the cluster is much lower.

Disadvantage: the database that stores the sessions must be maintained.
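The decoupling can be illustrated with a small Java sketch. All names here (SessionStore, InMemorySessionStore, AppServer) are made up for the example, and the in-memory map only stands in for the shared database so that the sketch runs on its own.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** What every application server needs from the central session storage. */
interface SessionStore {
    void save(String sessionId, String key, String value);
    Optional<String> load(String sessionId, String key);
}

/** Stand-in for the shared database; a real system would use an external store. */
final class InMemorySessionStore implements SessionStore {
    private final Map<String, Map<String, String>> sessions = new ConcurrentHashMap<>();

    public void save(String sessionId, String key, String value) {
        sessions.computeIfAbsent(sessionId, id -> new ConcurrentHashMap<>()).put(key, value);
    }

    public Optional<String> load(String sessionId, String key) {
        Map<String, String> session = sessions.get(sessionId);
        return session == null ? Optional.empty() : Optional.ofNullable(session.get(key));
    }
}

/** An application server that keeps no session state of its own. */
final class AppServer {
    private final SessionStore store;

    AppServer(SessionStore store) {
        this.store = store;
    }

    void login(String sessionId, String user) {
        store.save(sessionId, "user", user);
    }

    String whoAmI(String sessionId) {
        return store.load(sessionId, "user").orElse("anonymous");
    }
}
```

If two AppServer instances share one store, a user who logs in through the first is recognized by the second, so the balancer is free to send each request to any server.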

Cookie based. The session is stored in a cookie, and the browser tells the application server what its session is. This also decouples sessions from the application servers.

Advantages: Simple to implement, basically maintenance-free.

Disadvantages: cookie length limit, low security, bandwidth consumption.
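The sketch below shows one possible way to carry session data in a cookie value in Java. The HMAC signature is an addition for the example and is not part of the scheme described above: it lets the server detect a cookie that was modified by the client, but it does not hide the data, and the length and bandwidth costs remain. The class name CookieSession and the cookie layout (payload, a dot, then the signature) are made up.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import java.util.Optional;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Cookie-based session sketch: the session data travels in the cookie, signed with HMAC-SHA256. */
final class CookieSession {
    private final byte[] key;   // server-side secret shared by all application servers

    CookieSession(byte[] key) {
        this.key = key.clone();
    }

    /** Builds the cookie value: base64url(data) + "." + base64url(signature). */
    String encode(String sessionData) {
        String payload = b64(sessionData.getBytes(StandardCharsets.UTF_8));
        return payload + "." + b64(hmac(payload));
    }

    /** Returns the session data, or empty if the cookie is malformed or was tampered with. */
    Optional<String> decode(String cookie) {
        int dot = cookie.lastIndexOf('.');
        if (dot < 0) return Optional.empty();
        String payload = cookie.substring(0, dot);
        try {
            byte[] given = Base64.getUrlDecoder().decode(cookie.substring(dot + 1));
            if (!MessageDigest.isEqual(hmac(payload), given)) return Optional.empty();
            return Optional.of(new String(Base64.getUrlDecoder().decode(payload), StandardCharsets.UTF_8));
        } catch (IllegalArgumentException e) {
            return Optional.empty();   // not valid base64
        }
    }

    private static String b64(byte[] bytes) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    private byte[] hmac(String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Any application server that holds the same key can read the session from the cookie, so no server-side session storage is needed.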

  

It is worth mentioning that:

The load balancing algorithms nginx currently supports are WRR, SH (with support for consistent hashing), and fair (which I think boils down to LC). Besides acting as a balancer, nginx can also be used as a static resource server.

Keepalived + ipvsadm is more powerful. The algorithms currently supported are RR, WRR, LC, WLC, LBLC, SH, and DH.

The cluster modes Keepalived supports are NAT, DR, and TUN.

nginx itself does not provide a solution for session synchronization, while Apache provides support for session sharing.

After solving the problems above, the structure of the system is as follows:

Phase IV, database read and write separation

So far we have assumed that the database load is normal, but as traffic grows, the database load slowly rises as well. Some people may immediately think of doing what we did with the application servers: split the database into two and load-balance between them. But for a database it is not that simple. If we simply put the number
