Large sites grow out of small sites step by step, and the challenges come mainly from the huge user base: a hostile security environment, high concurrent access, and massive data. Any simple business operation becomes tricky once it has to process petabytes of data and serve hundreds of millions of users
Let's talk about the evolution of this process:
Initial stage
Large websites evolve from small ones, and they start with the same architecture
A small site has few visitors at first, and a single server is more than enough:
The application, database, files, and all other resources sit on one server, usually Linux, with
PHP
MySQL
Apache
installed to complete the deployment; then buy a domain name, rent a cheap server, and our website journey can begin
Separation of application services from data services
As the business grows, a single server gradually can no longer meet demand; at this point we can separate the application from the data
After the separation we use three servers: an application server, a file server, and a database server
The requirements for these three servers are different:
Application server
Handles lots of business logic, so it needs faster, more powerful CPUs
Database server
Needs fast disk retrieval and data caching, so it requires faster hard drives and more memory
File server
Stores user-uploaded files, so it needs larger disk capacity
After separating applications and data, each server's responsibilities are clearer and the site's performance improves further, but as users keep growing we need to optimize the architecture again
Using caching to improve performance
Access to a website follows the 80/20 rule: 80% of business accesses concentrate on 20% of the data
Therefore, caching this small amount of hot data reduces the database's access pressure, speeds up data access for the whole site, and improves the database's effective read/write performance.
The caching of Web sites can be divided into two types: local caches cached on the application server and remote caches on dedicated distributed cache servers
Local cache
Faster to access, but limited by application server memory: the amount of cached data is constrained, and the cache contends with the application for memory
Remote distributed cache
Can be deployed as a cluster of large-memory servers dedicated to caching, so in principle the cache service is not limited by a single machine's memory capacity
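The read path both kinds of cache enable is the cache-aside pattern. A minimal Python sketch, assuming a hypothetical `db_query` callback and an in-process store (a remote distributed cache such as Memcached exposes the same get/set idea over the network):

```python
import time

# A minimal in-process cache with expiry; a remote distributed cache
# exposes the same get/set interface over the network.
class SimpleCache:
    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() > expires_at:      # expired entries count as misses
            del self.store[key]
            return None
        return value

    def set(self, key, value):
        self.store[key] = (value, time.time() + self.ttl)

def get_product(cache, product_id, db_query):
    """Cache-aside read: try the cache first, fall back to the database."""
    key = f"product:{product_id}"
    value = cache.get(key)
    if value is None:                     # cache miss -> hit the database
        value = db_query(product_id)
        cache.set(key, value)             # populate for later readers
    return value
```

With this pattern, repeated reads of the same hot product within the TTL never touch the database.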
With caching in place, data access pressure is effectively relieved, but a single application server can handle only a limited number of request connections; at peak times the application server becomes the site's performance bottleneck
Improve Web site concurrency with Application server clusters
Using a cluster is a common means of solving high-concurrency, massive-data problems: once vertical scaling has been pushed to a certain level, it is time to scale horizontally.
When one server's processing power is insufficient, rather than replacing it with a more powerful machine, it is better to add another server to share the load. For a large website, no server, however powerful, can keep up with continuously growing business demand; adding servers to share the load is the more efficient approach
For the site architecture, if adding one server relieves load pressure, then the same step can be repeated to keep up with growing business demand, which gives the system scalability
A load-balancing dispatch server distributes user requests across the application server cluster; if there are more users, more application servers can be added, so that application server load no longer becomes the site's performance bottleneck
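A minimal sketch of what the dispatch server does, using round-robin scheduling over hypothetical server names (real balancers such as Nginx or LVS additionally track server health):

```python
import itertools

class RoundRobinBalancer:
    """Distributes incoming requests evenly across an application server cluster."""
    def __init__(self, servers):
        self.servers = list(servers)
        self._cycle = itertools.cycle(self.servers)

    def dispatch(self, request):
        server = next(self._cycle)        # pick the next server in rotation
        return server, request

# Adding capacity is just adding names to this list.
balancer = RoundRobinBalancer(["app1", "app2", "app3"])
```

Scaling out then means constructing the balancer with a longer server list; no application code changes.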
Database read/write separation
After adding the cache, most operations can complete without touching the database, but some reads (cache misses, cache expiration) and all writes still must hit the database; once the site's user base grows large enough, database load becomes the problem
Most databases today support master-slave hot backup: by configuring a master-slave relationship between two database servers, data updates on one server can be synchronized to the other. The website uses this capability to implement database read/write separation, further relieving database load
When the application server writes, it accesses the master database; the master replicates the update to the slave through the master-slave replication mechanism, so that when the application server reads, it can access the slave database.
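The routing logic on the application side can be sketched as follows; `master` and `slaves` are hypothetical connection objects with an `execute` method, standing in for real database driver connections:

```python
class ReadWriteRouter:
    """Routes writes to the master and reads to a slave (read/write separation)."""
    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves
        self._next = 0

    def execute(self, sql, *params):
        # Simple classification: SELECT/SHOW are reads, everything else writes.
        if sql.lstrip().upper().startswith(("SELECT", "SHOW")):
            slave = self.slaves[self._next % len(self.slaves)]
            self._next += 1               # round-robin across slaves
            return slave.execute(sql, *params)
        return self.master.execute(sql, *params)   # writes go to the master
```

A production router would also handle replication lag, e.g. by reading recently written rows from the master.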
Accelerate site response with reverse proxy and CDN
The basic principle of both CDN and reverse proxy is caching:
CDN
Deployed in network providers' machine rooms, so that a user's request fetches data from the machine room of the nearest network provider
Reverse proxy
Deployed in the site's central machine room; when a user's request reaches the central machine room, it first hits the reverse proxy server, and if the proxy has cached the requested resource, it is returned to the user directly
Both CDN and reverse proxy aim to return data to the user as soon as possible, which speeds up user access and also reduces the load on back-end servers
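The reverse proxy's cache check can be sketched like this; `fetch_from_backend` is a hypothetical stand-in for forwarding the request to the origin servers:

```python
def handle_request(path, proxy_cache, fetch_from_backend):
    """Reverse-proxy caching: serve from the proxy's cache when possible,
    otherwise forward the request to the back-end and cache the response."""
    cached = proxy_cache.get(path)
    if cached is not None:
        return cached, "HIT"          # served directly, back-end never touched
    response = fetch_from_backend(path)
    proxy_cache[path] = response      # cache for subsequent users
    return response, "MISS"
```

A CDN edge node runs essentially the same logic, just geographically closer to the user.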
Using Distributed file systems and distributed database systems
As the website's business continues to grow, the database system and the file system can be managed in a distributed fashion, just as the application servers were
Distributed database
Splitting the database is the last resort for a website's data layer; usually we split by business first, deploying the databases of different business domains on different database servers
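Splitting by business might look like the following routing table; the connection strings and table names are hypothetical:

```python
# Splitting databases by business domain: each business domain
# gets its own database server.
BUSINESS_DATABASES = {
    "users":  "mysql://db-users.internal/users",
    "orders": "mysql://db-orders.internal/orders",
    "items":  "mysql://db-items.internal/items",
}

# Which business domain owns each table.
TABLE_OWNER = {
    "user_profile": "users",
    "order": "orders",
    "order_item": "orders",
    "item": "items",
}

def database_for(table):
    """Map a table to the business database server that owns it."""
    return BUSINESS_DATABASES[TABLE_OWNER[table]]
```

The catch is that queries joining tables across business domains (e.g. `order_item` with `user_profile`) can no longer be done in one SQL statement; they must be assembled in application code.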
Using NoSQL and search engines
Both NoSQL databases and search engines are technologies that grew out of the Internet; the application server accesses these various data sources through a unified data access module, which spares the application the trouble of managing multiple data sources itself
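Such a unified data access module is essentially a facade. A sketch, where the three back-end objects are hypothetical and only need to expose the methods used here:

```python
class UnifiedDataAccess:
    """A unified data access module: the application asks for data by intent,
    and this facade routes each call to a relational database, a NoSQL
    key-value store, or a search engine."""
    def __init__(self, rdbms, nosql, search):
        self.rdbms = rdbms
        self.nosql = nosql
        self.search = search

    def get_order(self, order_id):
        # Transactional, relational data stays in the RDBMS.
        return self.rdbms.query("SELECT * FROM orders WHERE id = %s", order_id)

    def get_session(self, session_id):
        # High-volume, schema-free data goes to the key-value store.
        return self.nosql.get(f"session:{session_id}")

    def search_products(self, keywords):
        # Full-text queries go to the search engine.
        return self.search.query(keywords)
```

Application code calls `get_order` or `search_products` without knowing which storage system answers.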
Business Split
For a large site we can divide and conquer, splitting the site's entire business into different modules: a large shopping site, for example, can be split into home page, shops, orders, and buyers, each owned by a different business team
At the same time, we split the website into multiple applications along these module boundaries; each application is deployed and maintained independently, applications link to one another through hyperlinks (pointing at the other applications' addresses), and they share the same data storage system, together forming a complete interconnected system
Distributed services
As the business is split further, the overall system keeps growing, the application's overall complexity increases exponentially, and deployment and maintenance become harder and harder. Moreover, every application server needs to connect to the database servers; at a scale of tens of thousands of servers, the number of these connections grows with the square of the server count, and database connection resources run out
At this point, functionality common to several businesses is extracted and deployed independently: these reusable services, together with the databases they connect to, become public services, and the application systems call these public services through a distributed service framework to complete their business operations
By this stage, most technical problems have workable solutions; the remaining business-specific problems, such as real-time synchronization, can also be solved with existing technology
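To make the connection-count argument concrete, a quick back-of-the-envelope comparison (the server counts are purely illustrative):

```python
def direct_connections(app_servers, db_servers):
    # Without a service layer, every application server holds
    # connections to every database server.
    return app_servers * db_servers

def via_service_layer(app_servers, service_instances, db_servers):
    # With a service layer, applications connect only to the service
    # instances, and only the service instances connect to the databases.
    return app_servers * service_instances + service_instances * db_servers

print(direct_connections(10_000, 100))     # 1,000,000 connections
print(via_service_layer(10_000, 20, 100))  # 202,000 connections
```

Concentrating database access in a small service tier cuts the connection count by roughly the ratio of application servers to service instances.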
Large Web Site Technology Architecture (1)