An Alibaba P9 architect explains the evolution of a website from a single server to a large site serving hundreds of millions of requests
Phase 1: A Website Built on a Single Machine
In the early stages of a website, we often run all programs and software on a single machine. In this case, we use a servlet container such as Tomcat, Jetty, or JBoss, and build directly on JSP/Servlet technology, or use open-source framework stacks such as Maven + Spring + Struts + Hibernate or Maven + Spring + Spring MVC + MyBatis. Finally, we select a database management system to store the data, such as MySQL, SQL Server, or Oracle, connected and operated through JDBC.
Load all of the above software onto the same machine, and once the application runs, it is a small working system. The system structure is as follows:
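For illustration, a minimal JDBC access sketch in the spirit of the stack above. The URL, credentials, and the `users` table are placeholders, and actually running the query requires the MySQL driver on the classpath:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class JdbcSketch {
    // Builds a standard MySQL JDBC URL; host and database name are placeholders.
    static String jdbcUrl(String host, int port, String db) {
        return "jdbc:mysql://" + host + ":" + port + "/" + db;
    }

    // Typical query flow: get a connection, run SQL, iterate the result set.
    static void queryUsers(String url, String user, String pass) throws Exception {
        try (Connection conn = DriverManager.getConnection(url, user, pass);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT id, name FROM users")) {
            while (rs.next()) {
                System.out.println(rs.getInt("id") + " " + rs.getString("name"));
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(jdbcUrl("localhost", 3306, "shop"));
    }
}
```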
Phase 2: Application Server and Database Separation
After the website goes online, traffic gradually increases and the server load slowly rises. Before the server becomes overloaded, we should prepare to improve the website's load capacity. If optimizing the code is difficult, adding machines without improving single-machine performance is a good approach: it effectively raises the system's load capacity and is cost-effective.
What are the added machines used for? At this point, we can split the database and the web server onto separate machines, which not only improves the load capacity of each machine but also improves disaster tolerance.
The architecture after the application server and the database are separated is shown below:
Phase 3: Application Server Cluster
As the access volume continues to increase, a single application server can no longer meet the demand. Assuming the database server is not yet under pressure, we can grow from one application server to two or more and distribute user requests across them to improve load capacity. There is no direct interaction between the application servers; they all rely on the same database to provide external services. The best-known failover software is keepalived, which provides switching mechanisms similar to layers 3, 4, and 7. It is not tied to failover for one specific piece of software and can be applied to a variety of products. Keepalived can be used together with ipvsadm for load balancing.
Taking adding an application server as an example, the added system structure is as follows:
As the system evolves here, the following four problems will occur:
Who will forward user requests to the specific application server?
What are the forwarding algorithms?
How is the response returned to the user?
If you access different servers each time, how can you maintain session consistency?
Let's take a look at the solution to the problem:
1. The first problem is the load balancing problem. Generally, there are five solutions:
1. HTTP redirection. HTTP redirection forwards requests at the application layer: the user's request first reaches the HTTP load balancer, which selects a real server according to the algorithm and tells the user to redirect there; after receiving the redirect response, the user sends a second request to the real cluster.
Advantage: simple.
Disadvantage: poor performance.
2. DNS load balancing. When a user asks the DNS server for the IP address corresponding to the domain name, the DNS server directly returns the IP of one of the load-balanced servers.
Advantage: no load balancer needs to be maintained.
Disadvantage: when an application server fails, DNS cannot be updated in time; and since control over DNS load balancing lies with the domain name service provider, the website cannot manage or optimize it effectively.
3. Reverse proxy server. When a user's request arrives at the reverse proxy server (it has already reached the website's data center), the reverse proxy forwards it to a specific server according to the algorithm. Common servers such as Apache and Nginx can act as reverse proxies.
Advantage: easy deployment.
Disadvantage: the proxy server may become a performance bottleneck, especially when large files are uploaded.
4. IP-layer load balancing. After a request arrives at the load balancer, it modifies the request's destination IP address to forward the request and achieve load balancing.
Advantage: better performance.
Disadvantage: the load balancer's bandwidth becomes a bottleneck.
5. Data-link-layer load balancing. After a request arrives, the load balancer modifies the request's MAC address. Unlike IP-layer load balancing, after the server handles the request, the response is returned directly to the client without passing back through the load balancer.
2. The second problem is the cluster scheduling algorithm problem. There are 10 common scheduling algorithms.
1. rr round-robin scheduling algorithm. As the name implies, requests are distributed by polling.
Advantages: easy to implement
Disadvantage: The processing capability of each server is not considered.
2. wrr, the weighted round-robin algorithm. We set a weight for each server, and the scheduler dispatches requests according to the weights: the number of times a server is selected is proportional to its weight.
Advantage: Different server processing capabilities are taken into account.
3. sh, source address hashing: extract the user's IP address, generate a key with a hash function, then look up the corresponding value, i.e. the target server's IP, in a static mapping table. If the target machine is overloaded, null is returned.
4. dh, destination address hashing: same as above, except that the destination IP address is hashed.
Advantage: the above two algorithms allow the same user to access the same server.
5. lc, least connections. Requests are forwarded to the server with the fewest connections first.
Advantage: the load on each server in the cluster is more even.
6. wlc, weighted least connections. Adds a weight to each server on top of lc. Score: (active connections × 256 + inactive connections) / weight; the server with the smallest value is selected first.
Advantage: requests can be allocated based on the server's capabilities.
7. sed, shortest expected delay. sed is similar to wlc, except that inactive connections are not counted. Score: (active connections + 1) × 256 / weight; the server with the smallest value is selected first.
8. nq, never queue. An improvement on sed: if a server's connection count is 0, the balancer forwards the request to it directly, without computing the sed score.
9. LBLC, locality-based least connections. Based on the request's destination IP, the balancer finds the server that most recently served that IP and forwards the request there. If that server is overloaded, the least-connections algorithm is used instead.
10. LBLCR, locality-based least connections with replication. Based on the request's destination IP, the balancer finds the "server group" recently used for that IP (note: a group, not one specific server), then uses least connections to pick a specific server from the group and forwards the request. If that server is overloaded, a server outside the group is chosen by least connections, added to the group, and the request is forwarded to it.
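As a concrete sketch of the rr/wrr idea, here is a minimal weighted round-robin selector in Java. Server names and weights are made up, weights must be at least 1, and production balancers such as LVS or nginx implement smoother variants:

```java
public class WeightedRoundRobin {
    private final String[] servers;
    private final int[] weights;
    private int index = -1;    // index of the server selected last
    private int remaining = 0; // picks left for the current server in this cycle

    public WeightedRoundRobin(String[] servers, int[] weights) {
        this.servers = servers;
        this.weights = weights;
    }

    // Each server is returned 'weight' times per full cycle.
    public synchronized String next() {
        if (remaining == 0) {
            index = (index + 1) % servers.length;
            remaining = weights[index];
        }
        remaining--;
        return servers[index];
    }
}
```

With servers A (weight 2) and B (weight 1), the selection sequence is A, A, B, A, A, B, …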
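The wlc and sed scoring formulas above can also be written out as plain functions. This is only an illustration of the arithmetic; real LVS schedulers track connection counts inside the kernel:

```java
public class SchedulerScores {
    // wlc score: (active * 256 + inactive) / weight -- lower is better.
    static double wlcScore(int active, int inactive, int weight) {
        return (active * 256.0 + inactive) / weight;
    }

    // sed score: (active + 1) * 256 / weight -- inactive connections ignored.
    static double sedScore(int active, int weight) {
        return (active + 1) * 256.0 / weight;
    }
}
```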
3. The third problem is the cluster mode problem. Generally, there are three solutions:
1. NAT: the load balancer receives the user's request and forwards it to a specific server; the server processes it and returns the response to the load balancer, which returns it to the user.
2. DR: the load balancer forwards the request by rewriting its MAC address, and the server returns the response directly to the user; this requires the balancer and servers to be on the same LAN segment, but is widely supported.
3. TUN: the response is likewise returned directly to the user, but the request is forwarded through an IP tunnel, so the server's operating system must support the IP tunneling protocol, which limits cross-platform use.
4. The fourth problem is the session problem. Generally, there are four solutions:
1. Session Sticky. Session sticky assigns all requests of the same user session to one fixed server, so we no longer need to solve the cross-server session problem. A common implementation is ip_hash, i.e. the source address hashing algorithm mentioned above.
Advantages: easy to implement.
Disadvantage: when an application server restarts, the sessions on it are lost.
2. Session Replication. Session replication copies sessions in the cluster so that each server stores session data of all users.
Advantage: the load balancer no longer needs an ip_hash algorithm to pin requests to a server.
Disadvantage: replication consumes significant bandwidth, and under heavy traffic sessions occupy a large amount of memory on every server, which is wasteful.
3. Centralized Session data storage: session data is stored in a database to decouple sessions from application servers.
Advantage: compared with the session replication solution, the stress on bandwidth and memory between clusters is much reduced.
Disadvantage: the database storing session needs to be maintained.
4. Cookie-based. The session data is stored in the cookie, and the browser tells the application server what its session is on every request, which also decouples the session from the application server.
Advantages: simple and maintenance-free.
Disadvantages: cookie length restriction, low security, and high bandwidth consumption.
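To make the centralized-storage option (solution 3) concrete, here is a minimal sketch of a shared session store. The class and method names are invented for illustration, and in production the map would be backed by a database or cache server rather than local memory:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for a session store shared by all application servers.
public class SessionStore {
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    // Creates a new session and returns its id (would go into the user's cookie).
    public String create() {
        String id = UUID.randomUUID().toString();
        sessions.put(id, new ConcurrentHashMap<>());
        return id;
    }

    // Assumes the session exists; a real store would handle expiry and misses.
    public void put(String sessionId, String key, Object value) {
        sessions.get(sessionId).put(key, value);
    }

    public Object get(String sessionId, String key) {
        Map<String, Object> s = sessions.get(sessionId);
        return s == null ? null : s.get(key);
    }
}
```

Because every application server talks to the same store, any server can serve any request of the session.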
It is worth mentioning that:
Currently, nginx supports the following load-balancing algorithms: wrr, sh (with consistent-hashing support), and fair (which can be classified as lc). In addition, nginx can serve as a static resource server.
Keepalived + ipvsadm is relatively powerful. Currently, supported algorithms include rr, wrr, lc, wlc, lblc, sh, and dh.
Keepalived supports the following cluster modes: NAT, DR, and TUN.
Nginx does not provide a solution for session synchronization, while apache provides support for session sharing.
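For illustration, a minimal nginx reverse-proxy configuration using the default wrr algorithm might look like this (the upstream name and addresses are placeholders, and the `upstream`/`server` blocks belong inside the `http` context):

```nginx
upstream app_cluster {
    # default algorithm is wrr; 'ip_hash;' or 'least_conn;' could be used instead
    server 10.0.0.1:8080 weight=3;
    server 10.0.0.2:8080 weight=1;
}

server {
    listen 80;
    location / {
        proxy_pass http://app_cluster;
    }
}
```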
After solving the above problems, the system structure is as follows:
Phase 4: Database Read/Write Splitting
So far we have assumed the database load is normal, but as traffic increases, the database load grows too. Someone may immediately think of treating the database like the application servers and simply load-balancing it. But it is not that simple for databases: if we naively split the database in two and distribute requests between machine A and machine B, the data in the two databases will become inconsistent. In this case, we can first consider read/write splitting.
The structure of the database system after read/write splitting is as follows:
This structure change will also cause two problems:
Data synchronization between master and slave Databases
Application Selection of data sources
Solution:
We can use MySQL's master/slave mechanism to implement master-slave replication.
Use third-party database middleware such as mycat. Mycat evolved from Cobar, Alibaba's open-source database middleware, which was later discontinued; mycat is a good open-source MySQL sharding middleware developed in China.
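A highly simplified sketch of what read/write splitting middleware does: route each statement to the master or a read slave by inspecting the SQL. Real middleware such as mycat parses SQL properly; this string check is only illustrative:

```java
public class ReadWriteRouter {
    // Rough rule: SELECTs go to a read slave, everything else to the master.
    static String route(String sql) {
        String s = sql.trim().toLowerCase();
        return s.startsWith("select") ? "slave" : "master";
    }
}
```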
Phase 5: Use Search Engines to Relieve Database Read Pressure
Databases often cannot cope well with fuzzy search, and even read/write splitting does not solve this problem. Taking an e-commerce website as an example, published products are stored in the database, and the function users rely on most is finding products, especially by product title. This is generally implemented with SQL's LIKE, which is very expensive. In this case, we can use a search engine's inverted index instead.
Search engines have the following advantages:
It can greatly improve the query speed.
After a search engine is introduced, the following overhead is also introduced:
It brings a lot of maintenance work: we need to build the indexes ourselves and design full and incremental build processes to meet non-real-time and real-time query requirements.
Search engine cluster maintenance required
The search engine cannot replace the database; it solves the "read" problem in certain scenarios. Whether to introduce one requires weighing the requirements of the whole system. The system structure after the search engine is introduced is as follows:
Phase 6: Use Cache to Relieve Database Read Pressure
1. Caching at the application layer and database layer
As the number of visits increases, many users access the same popular content, and it is unnecessary to read it from the database every time. We can use caching: for example, Google's open-source caching library Guava or Memcached as the application-layer cache, and redis as the database-layer cache.
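As a stand-in for an application-layer cache such as Guava or Memcached, here is a minimal LRU cache built on the JDK's LinkedHashMap:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal application-layer LRU cache: hot entries stay, cold ones are evicted.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // access-order mode: reads refresh recency
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the least-recently-used entry when full
    }
}
```

With capacity 2, inserting a and b, touching a, then inserting c evicts b (the least recently used entry).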
In addition, relational databases are not very suitable for some scenarios. For example, suppose I want a "daily login error count limit" feature: when a user's login fails, we record the user's IP address and its number of errors. Where should this data go? Storing it all in memory would obviously take too much memory; storing it in a relational database means creating a table, writing the corresponding Java bean, and writing SQL. Yet the data we want to store is simply key-value data like {ip: errorNumber}. For such data, we can use a NoSQL database instead of a traditional relational database.
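The error-count example above boils down to a key-value counter. Here is a minimal in-memory sketch (class and method names invented); in production the counts would live in a NoSQL store such as redis, with a daily expiry:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Key-value data shaped like {ip: errorNumber}.
public class LoginErrorCounter {
    private final ConcurrentHashMap<String, AtomicInteger> errors = new ConcurrentHashMap<>();
    private final int limit;

    public LoginErrorCounter(int limit) {
        this.limit = limit;
    }

    // Records one failed login; returns true while the IP may keep trying.
    public boolean recordErrorAndCheck(String ip) {
        int n = errors.computeIfAbsent(ip, k -> new AtomicInteger()).incrementAndGet();
        return n < limit;
    }
}
```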
2. Page Cache
In addition to data caching, there is also page caching, for example using HTML5 localStorage or cookies.
Advantages:
Reduce database pressure
Greatly improves access speed
Disadvantages:
Need to maintain the Cache Server
Improved encoding complexity
It is worth mentioning that:
The scheduling algorithm for a cache cluster differs from those for the application servers and databases above: for caches, we recommend a consistent hashing algorithm to increase the hit rate. Interested readers can consult the relevant materials.
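A minimal consistent-hash ring with virtual nodes, the approach recommended above for cache clusters. Server names are placeholders, and `String.hashCode` is used only to keep the sketch short; a real ring would use a better-distributed hash such as MD5 or Ketama:

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Consistent hashing: keys map to the first server clockwise on the ring,
// so adding/removing a server only remaps a fraction of the keys.
public class ConsistentHash {
    private static final int VIRTUAL_NODES = 100;
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    public void addServer(String server) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(server + "#" + i), server);
        }
    }

    public String serverFor(String key) {
        SortedMap<Integer, String> tail = ring.tailMap(hash(key));
        // Wrap around to the start of the ring if we fall past the last node.
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    private static int hash(String s) {
        return s.hashCode();
    }
}
```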
Structure after the cache is added:
Phase 7: Horizontal and Vertical Database Splitting
So far in our website's evolution, transaction, product, and user data have all remained in the same database. Even with caching and read/write splitting, as the pressure on the database keeps increasing, the database bottleneck becomes more and more prominent. At this point we have two options: vertical splitting and horizontal splitting.
7.1 vertical data split
Vertical Split means to split different business data in the database into different databases. Combined with the current example, data of transactions, commodities, and users is separated.
Advantages:
Solves the problem of all services sharing a single database.
More optimizations can be made based on business characteristics.
Disadvantages:
Multiple databases need to be maintained
Problem:
How to handle transactions that previously spanned these businesses and now cross databases
Cross-database join
Solution:
We should try to avoid cross-database transactions at the application layer. If cross-database transactions are required, we should try to control them in the code.
We can solve this problem through third-party applications. As mentioned above, mycat provides a wide range of cross-database join solutions. For details, refer to the official mycat documentation.
The structure after vertical split is as follows:
7.2 horizontal data split
Horizontal data splitting means splitting the data of one table across two or more databases. The reason for horizontal splitting is that a business's data volume or update volume has reached the bottleneck of a single database; in that case, splitting the table across two or more databases relieves it.
Advantages:
If the problems below are solved, we can handle the ever-increasing data volume and write volume well.
Problem:
Applications accessing user information must solve an SQL routing problem: user data is now split across two databases, and during data operations we need to know which database holds the data.
Primary key handling also changes: simple auto-increment fields can no longer be used directly, since ids generated independently in each database would collide.
Paging across databases becomes troublesome.
Solution:
We can again rely on third-party middleware such as mycat. Mycat's SQL parsing module parses our SQL and forwards the request to the specific database according to our configuration.
We can use UUIDs to guarantee unique ids, or a custom ID generation scheme.
Mycat also provides rich paging queries, for example querying each database by page and then merging the results.
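The routing and primary-key points above can be sketched as follows, assuming two user databases split by id parity (the database names and helper are hypothetical):

```java
import java.util.UUID;

public class ShardRouter {
    // Route by user id: even ids to user_db_0, odd ids to user_db_1.
    static String databaseFor(long userId) {
        return "user_db_" + (userId % 2);
    }

    // Globally unique key, since per-database auto-increment ids would collide.
    static String newId() {
        return UUID.randomUUID().toString();
    }
}
```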
Structure after horizontal split of data:
Phase 8: Application splitting
8.1 split applications
As the business develops, there are more and more business lines and the application grows. We need to consider how to keep the application from becoming bloated; at this point, we split the one application into two or more. In our example above, we can split along users, commodities, and transactions, giving two subsystems: "user + product" and "user + transaction".
Split Structure:
Problem:
After this split, some code may be duplicated: both products and transactions need user information, so code for user-information operations is kept in both systems. Ensuring that this code can be reused is a problem to be solved.
Solution:
Solve the problem by taking the service-based route
8.2 service-oriented
To solve the problems arising after the application is split, we extract the public services, forming a service-oriented architecture (SOA).
The service-oriented system structure is adopted:
Advantages:
The same code will no longer be scattered across different applications; these implementations live in the service centers, making the code easier to maintain.
We put the interaction between databases in various service centers, so that "front-end" web applications focus more on interaction with browsers.
Problem:
How to make remote service calls
Solution:
We can solve this problem by introducing message-oriented middleware below.
Phase 9: Introduce Message-Oriented Middleware
As the website continues to grow, our system may acquire sub-modules developed in different languages and subsystems deployed on different platforms. At this point, we need a platform- and language-independent way to transmit data reliably, one that can also make load balancing transparent, collect and analyze call data, and predict the website's traffic growth so we can plan how it should evolve. Open-source options include Alibaba's Dubbo (a service framework), which can be used together with the open-source distributed coordination service ZooKeeper for service registration and discovery.
Structure after message-oriented middleware is introduced:
Summary
The above evolution process is just an example. It is not suitable for all websites. In reality, the website evolution process is closely related to its own business and different problems, and there is no fixed pattern. Only by carefully analyzing and constantly exploring can you find the architecture suitable for your website.