Technical framework for large Web sites

Source: Internet
Author: User
Tags: server, memory

A large website grows step by step out of a small one. The main challenges come from a huge user base, a hostile security environment, highly concurrent access, and massive data volumes. Even simple business logic becomes tricky once it has to handle petabytes of data and serve hundreds of millions of users.

Now let's talk about the evolution:

Initial phase

Large websites grow out of small ones, and so do their architectures.

A small website has few visitors at first, so a single server is more than sufficient.

The application, database, files, and all other resources sit on that one server. A stack of Linux, PHP, MySQL, and Apache is usually enough to deploy the whole project; buy a domain name, rent a cheap server, and the site is ready to launch.

Separation of application services from data services

As the business grows, a single server gradually becomes unable to meet demand, so we separate the application from the data.

After the separation we use three servers: an application server, a file server, and a database server.

  

These three servers have different hardware requirements:

The application server handles a lot of business logic, so it needs faster, more powerful CPUs.

The database server needs fast disk retrieval and data caching, so it needs faster hard drives and more memory.

The file server stores the files uploaded by users, so it needs more disk space.

With the application and data separated, each server has a single responsibility and the site's performance improves. As the number of users keeps growing, however, the architecture needs further optimization.

Improving performance with caching

Website access follows the 80/20 rule: 80% of requests are concentrated on 20% of the data.

Caching this small, frequently accessed portion of the data therefore reduces pressure on the database, speeds up data access across the whole site, and improves the database's read and write performance.

A website can use two kinds of cache: a local cache on the application server and a remote cache on dedicated distributed cache servers.

The local cache is faster to access, but it is limited by the application server's memory and competes with the application for that memory.

A remote distributed cache can be deployed as a cluster of large-memory servers dedicated to caching, so in theory the cache service is not limited by memory capacity.

  

Caching effectively relieves data access pressure, but a single application server can handle only a limited number of requests. At peak traffic, the application server becomes the site's performance bottleneck.
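
The lookup order described in this section (local cache first, then the remote cache, then the database) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class names are hypothetical, a plain dict stands in for the remote cache cluster, and the local cache is bounded by entry count rather than by real memory use.

```python
from collections import OrderedDict


class LocalLRUCache:
    """In-process cache with a fixed number of entries (hypothetical helper)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry


class CachedReader:
    """Reads through the local cache, then the remote cache, then the database."""

    def __init__(self, local, remote, load_from_db):
        self.local = local
        self.remote = remote  # assumption: a dict stands in for a remote cache
        self.load_from_db = load_from_db
        self.db_reads = 0

    def get(self, key):
        value = self.local.get(key)
        if value is not None:
            return value
        value = self.remote.get(key)
        if value is None:
            value = self.load_from_db(key)  # both caches missed
            self.db_reads += 1
            self.remote[key] = value
        self.local.put(key, value)
        return value


database = {"item:1": "laptop", "item:2": "phone", "item:3": "tablet"}
reader = CachedReader(LocalLRUCache(capacity=2), {}, database.__getitem__)
for key in ["item:1", "item:1", "item:2", "item:1", "item:3", "item:2"]:
    reader.get(key)
print(reader.db_reads)  # 3: each item is loaded from the database only once
```

The small local cache evicts entries under memory pressure, while the larger remote cache still shields the database from repeated reads.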

Improving concurrent processing capacity with application server clusters

Clustering is the common way for websites to handle high concurrency and massive data. Once scaling up (vertical scaling) has gone as far as it can, it is time to scale out (horizontal scaling).

When one server's processing capacity is not enough, adding a server to share the load is better than replacing it with a more powerful one. For a large website, no single server, however powerful, can keep up with continuously growing business demand; adding servers to share the load is the more effective approach.

If adding one server relieves the load, the same approach can be repeated to absorb a steady stream of new business demand, which is what makes the system scalable.

  

The load balancer can dispatch user requests to any server in the application server cluster. As users increase, more application servers are added to the cluster, so application server load is no longer the site's performance bottleneck.
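
A minimal sketch of such a dispatcher is shown below. It assumes a simple round-robin policy, which is only one of several possible scheduling strategies and is not named by the text above; the class and server names are hypothetical.

```python
from collections import Counter


class RoundRobinBalancer:
    """Hands each incoming request to the next server in the cluster."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = 0

    def add_server(self, server):
        # Scaling out: a new application server simply joins the rotation.
        self.servers.append(server)

    def dispatch(self, request):
        server = self.servers[self._next % len(self.servers)]
        self._next += 1
        return server


balancer = RoundRobinBalancer(["app-1", "app-2"])
first = [balancer.dispatch(f"GET /page/{i}") for i in range(4)]
balancer.add_server("app-3")  # traffic grew, so add capacity
later = Counter(balancer.dispatch(f"GET /page/{i}") for i in range(6))
print(first)
print(later)
```

After the third server joins, each server receives a third of the new requests without any change to the application code.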

Database read-write separation

With a cache in place, most reads no longer touch the database, but some reads (cache misses and expired entries) and all writes still do. Once the user base reaches a certain size, database load becomes the problem.

Most mainstream databases now support master-slave hot standby: by configuring a master-slave relationship between two database servers, updates on one server are synchronized to the other. A website can use this feature to separate database reads from writes and so relieve the load on the database.

  

The application server sends writes to the master database, which propagates the updates to the slave database through master-slave replication. Reads are then served from the slave database.
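
The routing rule above can be sketched as follows. This is an illustration only: the in-memory `Database` class and the `ReadWriteRouter` name are hypothetical, and replication is copied synchronously here, whereas real master-slave replication is usually asynchronous and a read may briefly see stale data.

```python
import itertools


class Database:
    """In-memory stand-in for one database server (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}


class ReadWriteRouter:
    """Sends writes to the master and spreads reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)
        self._slave_cycle = itertools.cycle(self.slaves)

    def write(self, key, value):
        self.master.rows[key] = value
        self._replicate()

    def _replicate(self):
        # Simplification: copy the master's data to every slave immediately.
        for slave in self.slaves:
            slave.rows = dict(self.master.rows)

    def read(self, key):
        slave = next(self._slave_cycle)
        return slave.name, slave.rows.get(key)


master = Database("master")
router = ReadWriteRouter(master, [Database("slave-1"), Database("slave-2")])
router.write("user:1", "alice")
print(router.read("user:1"))
print(router.read("user:1"))
```

Because only writes reach the master, read capacity can be increased by adding slaves.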

Using a reverse proxy and CDN to speed up website responses

Both a CDN and a reverse proxy work on the same basic principle: caching.

A CDN is deployed in network providers' data centers, so a user's request is served from the data center of the nearest network provider.

A reverse proxy is deployed in the website's central data center. A request that reaches the data center hits the reverse proxy first; if the proxy has the requested resource cached, it returns it to the user directly.

  

The purpose of both the CDN and the reverse proxy is to return data to the user as early as possible. This speeds up user access and also reduces the load on the backend servers.
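
The caching behavior of a reverse proxy can be sketched as below. The class name, the time-to-live policy, and the injectable clock are all assumptions made for this example; real proxies decide what to cache and for how long based on their own configuration and on response headers.

```python
import time


class CachingReverseProxy:
    """Serves cached responses and only forwards misses to the backend."""

    def __init__(self, fetch_from_backend, ttl_seconds, clock=time.monotonic):
        self.fetch_from_backend = fetch_from_backend
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._cache = {}  # path -> (expires_at, body)
        self.backend_hits = 0

    def handle(self, path):
        now = self.clock()
        entry = self._cache.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the backend is not contacted
        body = self.fetch_from_backend(path)
        self.backend_hits += 1
        self._cache[path] = (now + self.ttl_seconds, body)
        return body


proxy = CachingReverseProxy(lambda path: f"content of {path}", ttl_seconds=60)
for _ in range(5):
    proxy.handle("/static/logo.png")
print(proxy.backend_hits)  # 1: only the first request reached the backend
```

A CDN node applies the same idea, only closer to the user.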

Using distributed file systems and distributed database systems

As the business keeps growing, the database and file system can be distributed in the same way as the application servers.

A distributed database is the last resort for splitting a website's database. The more common approach is to split by business, deploying the databases of different businesses on different database servers.
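
Splitting by business amounts to a lookup from business area to database server. The sketch below assumes a hypothetical `BusinessShardMap` class and made-up server addresses; it is not tied to any particular database product.

```python
class BusinessShardMap:
    """Maps each business area to the database server that stores its data."""

    def __init__(self, assignments):
        self._assignments = dict(assignments)

    def server_for(self, business):
        try:
            return self._assignments[business]
        except KeyError:
            raise LookupError(f"no database assigned to {business!r}") from None

    def move(self, business, server):
        # Relocating one business does not affect the others.
        self._assignments[business] = server


shards = BusinessShardMap({
    "users": "db-users.example.internal",    # hypothetical addresses
    "orders": "db-orders.example.internal",
    "shops": "db-shops.example.internal",
})
print(shards.server_for("orders"))
```

Each business database can then be sized, tuned, and moved independently.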

  

Using NoSQL and search engines

Both NoSQL and search engines are technologies that grew out of the Internet. The application server accesses all of these data stores through a unified data access module, which spares the application the trouble of managing multiple data sources.
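
One way to picture a unified data access module is a single facade that hides which store serves each request. In the sketch below every name is hypothetical, and plain dictionaries stand in for a relational database, a NoSQL store, and a search engine index.

```python
class DataAccessModule:
    """Single entry point in front of several kinds of data store."""

    def __init__(self):
        self._orders = {}    # stands in for a relational database
        self._profiles = {}  # stands in for a NoSQL key-value store
        self._index = {}     # stands in for a search engine: word -> product ids

    def save_order(self, order_id, order):
        self._orders[order_id] = order

    def get_order(self, order_id):
        return self._orders.get(order_id)

    def save_profile(self, user_id, profile):
        self._profiles[user_id] = profile

    def get_profile(self, user_id):
        return self._profiles.get(user_id)

    def index_product(self, product_id, description):
        for word in description.lower().split():
            self._index.setdefault(word, set()).add(product_id)

    def search_products(self, query):
        words = query.lower().split()
        if not words:
            return set()
        # A product matches only if it contains every word in the query.
        results = set(self._index.get(words[0], set()))
        for word in words[1:]:
            results &= self._index.get(word, set())
        return results


data = DataAccessModule()
data.save_order(1001, {"buyer": 7, "total": 59.0})
data.save_profile(7, {"name": "alice", "theme": "dark"})
data.index_product("p1", "red running shoes")
data.index_product("p2", "red dress")
print(data.get_order(1001), data.get_profile(7), data.search_products("red shoes"))
```

The application code calls one module and never needs to know how many data sources sit behind it.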

Business split

A large website can be handled by divide and conquer: the site's business is split into modules. A large shopping and transaction site, for example, can be divided into the home page, shops, orders, buyers, and so on, each owned by a different business team.

Following the same module boundaries, the website is split into multiple applications that are deployed and maintained independently. The applications are related to each other through hyperlinks (each pointing to a different application's address), and they form one interconnected system by sharing the same data storage system.

  

Distributed services

As the business is split further, the overall system keeps growing and its complexity increases exponentially, making deployment and maintenance harder and harder. Every application server also has to connect to the database services; at a scale of tens of thousands of servers, the number of these connections scales with the size of the server fleet, and database connection resources run out.

At this point, the business logic shared across applications is extracted and deployed independently. These reusable operations, database connections included, are offered as common services, and each application system only needs to call the common services through a distributed service layer to complete its business operations.
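
The effect on database connections can be illustrated with a small sketch. All names here are hypothetical, the database is an in-memory stand-in, and the applications call the service with a direct method call, where a real deployment would go through a remote call over the network.

```python
class FakeDatabase:
    """Counts how many connections have been opened against it."""

    def __init__(self):
        self.open_connections = 0

    def connect(self):
        self.open_connections += 1
        return self


class UserService:
    """Common service: the only component that connects to the user database."""

    def __init__(self, database):
        self._connection = database.connect()
        self._users = {1: "alice", 2: "bob"}  # sample data for the sketch

    def get_user_name(self, user_id):
        return self._users.get(user_id)


class Application:
    """A business application that reuses the common service."""

    def __init__(self, name, user_service):
        self.name = name
        self.user_service = user_service

    def greet(self, user_id):
        return f"[{self.name}] hello, {self.user_service.get_user_name(user_id)}"


database = FakeDatabase()
user_service = UserService(database)
apps = [Application(name, user_service) for name in ("home", "shops", "orders")]
for app in apps:
    print(app.greet(1))
print(database.open_connections)  # 1: the applications hold no connections
```

Adding more applications increases calls to the service, not connections to the database.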

  

At this point most of the technical problems are solved, and business-specific problems such as real-time synchronization can also be handled with existing technology.
