Evolution of website architecture

Source: Internet
Author: User
Tags cassandra server memory varnish couchdb

How does a website grow from a handful of users and low concurrency at launch to tens of millions of users and tens of thousands of concurrent requests? This article briefly discusses how a small website's architecture evolves along the way. It draws mainly on the book "Large Web Site Architecture Design", which covers the relevant topics fairly comprehensively.

1. Initial stage

At launch a site has little traffic, so a single server is more than enough. The application, the database, static resources, and everything else live on that one server. A LAMP or LNMP stack (Linux + Apache/Nginx + MySQL + PHP/Python, and so on) is generally enough to build the site. The architecture is shown in the accompanying diagram.

2. Separation of application services from data Services

As the business grows, user traffic and stored data increase until a single server can no longer meet demand. The application service and the data services then need to be separated onto different servers, as shown in the accompanying diagram.

Because the servers provide different services, they have different hardware requirements:

Table of resource requirements for different services

Server type          Responsibility                                                  Resource requirements
Application server   Processes all business logic                                    Faster CPUs, and more of them
File server          Stores user-uploaded files and files the services themselves need   Larger disk space
Database server      Caches and retrieves data                                       More memory and faster disks

3. Caching

As the number of users keeps growing, the database comes under too much pressure, which delays responses and hurts the user experience. Caching is the first choice when optimizing site performance.

Website access follows the 80/20 rule: 80% of business requests concentrate on 20% of the data.

The caches a website uses fall into two kinds: a local cache on the application server and a remote distributed cache. The remote distributed cache can generally be deployed as a cluster, and its servers need a relatively large amount of memory. The accompanying diagram shows this architecture.
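The lookup order implied by the two kinds of cache can be sketched as follows. This is a minimal illustration under stated assumptions, not a production design: the names `LocalLRUCache` and `TwoLevelReader` are hypothetical, and a plain dict stands in for both the remote cache cluster and the database.

```python
from collections import OrderedDict


class LocalLRUCache:
    """Small in-process cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


class TwoLevelReader:
    """Reads through the local cache, then the remote cache, then the database."""

    def __init__(self, local, remote, load_from_db):
        self.local = local
        self.remote = remote  # assumption: a dict stands in for the cache cluster
        self.load_from_db = load_from_db
        self.db_reads = 0

    def get(self, key):
        value = self.local.get(key)
        if value is not None:
            return value
        value = self.remote.get(key)
        if value is None:
            # Both caches missed: only now does the database see the request.
            value = self.load_from_db(key)
            self.db_reads += 1
            self.remote[key] = value
        self.local.put(key, value)
        return value


database = {"user:1": "alice", "user:2": "bob"}  # hypothetical data
reader = TwoLevelReader(LocalLRUCache(capacity=1), {}, database.__getitem__)
first = reader.get("user:1")   # misses both caches, reads the database
second = reader.get("user:1")  # served from the local cache
print(first, second, reader.db_reads)
```

Repeated reads of hot keys never reach the database, which is the effect the 80/20 rule makes worthwhile.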

4. Application Server cluster deployment

As traffic grows, a single application server can no longer cope with the rising number of requests. However powerful its hardware, one server cannot handle the load at business peaks.

Clustering remains the most common way for a website to handle high concurrency and massive data: the site scales horizontally, and a cluster provides good scalability.

The load balancer can be implemented in many ways, such as LVS, Nginx, or F5, and can be combined with high-availability (HA) software such as Heartbeat and Keepalived.

With the application servers deployed as a cluster behind a load-balancing scheduler, user requests can be distributed to any machine in the cluster. Servers can easily be added or removed as traffic changes, keeping the load on each server within an acceptable range. The accompanying diagram shows this architecture.
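To make the scheduling idea concrete, here is a minimal round-robin sketch. Round-robin is only one of several possible policies and is chosen here for simplicity. The class `RoundRobinBalancer` and the server names are hypothetical and do not reflect how LVS, Nginx, or F5 are configured.

```python
class RoundRobinBalancer:
    """Hands out application servers in turn; servers can join or leave."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._next = 0

    def add_server(self, server):
        self._servers.append(server)

    def remove_server(self, server):
        self._servers.remove(server)
        self._next = 0  # restart the rotation after the pool changes

    def pick(self):
        if not self._servers:
            raise RuntimeError("no application servers available")
        server = self._servers[self._next % len(self._servers)]
        self._next = (self._next + 1) % len(self._servers)
        return server


balancer = RoundRobinBalancer(["app-1", "app-2"])  # hypothetical server names
before_scale_out = [balancer.pick() for _ in range(4)]
balancer.add_server("app-3")  # traffic grew, so add a machine
after_scale_out = [balancer.pick() for _ in range(3)]
print(before_scale_out, after_scale_out)
```

Because each request may land on a different machine, this scheme works best when the application servers keep no per-user state locally.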

5. Database read/write separation

Cache misses and reads of expired cache entries still have to go to the database, and every write also has to reach the database, so database pressure rises as traffic grows.

A master-slave hot-standby setup can be used to separate reads from writes, for example MySQL's master-slave replication. When reads far outnumber writes, one master with multiple slaves can be used.

The data access module in the application should make read/write separation transparent to the rest of the application.
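One way a data access module could hide the split is to route each statement by its leading keyword. The sketch below is a simplification under stated assumptions: `ReadWriteRouter` is a hypothetical name, strings stand in for real database connections, and the keyword check ignores cases such as transactions or `SELECT ... FOR UPDATE`, which should go to the master.

```python
import itertools


class ReadWriteRouter:
    """Sends SELECT statements to replicas in turn and everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        # cycle() over an empty list would never yield, so keep None instead.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def connection_for(self, sql):
        words = sql.lstrip().split(None, 1)
        verb = words[0].upper() if words else ""
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.master


router = ReadWriteRouter("master", ["replica-1", "replica-2"])  # hypothetical names
statements = (
    "SELECT * FROM users",
    "UPDATE users SET name = 'x'",
    "select 1",
)
targets = [router.connection_for(sql) for sql in statements]
print(targets)
```

Callers only ask for a connection for a statement and never choose a server themselves. With asynchronous replication a replica may briefly lag the master, so reads that must see a just-written row may still need to go to the master.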

The accompanying diagram shows this architecture.

6. Accelerating your website with CDN and reverse proxy

China's network environment is complex: users in different regions see very different speeds when visiting the same site, and access latency is positively correlated with user churn.

The main ways to speed up website access and reduce load on the backend servers are a CDN and a reverse proxy.

Both the CDN and the reverse proxy work by caching:

A CDN is deployed in network providers' data centers and caches some of the site's hot static resources, such as videos and images. When users request the site, they fetch this data from the provider data center nearest to them.

A reverse proxy is deployed in the website's central data center and is part of the site's front-end architecture. When a user request reaches the central data center, it hits the reverse proxy server first; if the requested resource (a static one) is cached there, it is returned directly.

Mature open-source reverse proxy software includes Squid and Varnish. Varnish is recommended: it is somewhat stronger in stability, access speed, and number of concurrent connections.

Prerequisites for using a cache: 1. Data access is unevenly distributed, with some data accessed far more often than the rest; 2. Data stays valid for a certain period and does not expire quickly. Otherwise cached data is invalidated soon after it is loaded, or stale data is served (dirty reads) and results are incorrect.
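The second prerequisite, a validity period, can be illustrated with a small time-to-live (TTL) cache placed in front of an origin server. This is a sketch only: `CachingProxy` and the `fetch_from_origin` callback are hypothetical, and it is not how Squid or Varnish are implemented. The clock is injectable so that expiry can be demonstrated without waiting.

```python
import time


class CachingProxy:
    """Serves cached bodies until they expire, then refetches from the origin."""

    def __init__(self, fetch_from_origin, ttl_seconds, clock=time.monotonic):
        self.fetch_from_origin = fetch_from_origin
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._cache = {}  # path -> (expires_at, body)
        self.origin_hits = 0

    def get(self, path):
        now = self.clock()
        entry = self._cache.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]  # still valid: the backend is not contacted
        body = self.fetch_from_origin(path)
        self.origin_hits += 1
        self._cache[path] = (now + self.ttl_seconds, body)
        return body


fake_now = [0.0]  # a controllable clock for the demonstration
proxy = CachingProxy(lambda path: "body of " + path, ttl_seconds=60,
                     clock=lambda: fake_now[0])
proxy.get("/logo.png")           # first request goes to the origin
proxy.get("/logo.png")           # second request is served from the cache
hits_while_fresh = proxy.origin_hits
fake_now[0] = 61.0               # the entry has now expired
proxy.get("/logo.png")           # so the origin is contacted again
hits_after_expiry = proxy.origin_hits
print(hits_while_fresh, hits_after_expiry)
```

If the data changed much faster than the TTL, most responses would be stale; if the TTL were shortened to match, almost every request would reach the origin. That is why short-lived data is a poor fit for caching.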

The accompanying diagram shows this architecture.

7. Distributed file system and distributed database system

As business volume grows, the most common way to split the database is by business: the databases of different business lines are deployed on different servers.

A distributed database is generally the last resort for splitting a website's database, and is used only when a single table has grown very large.
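The two levels of splitting can be sketched as two routing functions: one picks a database by business, the other spreads one very large table across shards. All names here (`BUSINESS_DATABASES`, `ORDER_SHARDS`, the server names) are hypothetical, and hashing the key modulo the shard count is just one simple placement scheme among several.

```python
import zlib

# Split by business: each business line has its own database server.
BUSINESS_DATABASES = {
    "users": "db-users",
    "orders": "db-orders",
}

# Split a single very large table: rows are spread over several shards.
ORDER_SHARDS = ["db-orders-0", "db-orders-1", "db-orders-2", "db-orders-3"]


def database_for_business(business):
    """Return the database server that holds this business line's data."""
    return BUSINESS_DATABASES[business]


def shard_for_order(order_id):
    """Return the shard that stores the given order."""
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    digest = zlib.crc32(str(order_id).encode("utf-8"))
    return ORDER_SHARDS[digest % len(ORDER_SHARDS)]


print(database_for_business("users"))
print(shard_for_order(42) == shard_for_order(42))  # same key, same shard
```

With modulo placement, changing the number of shards moves most keys to a different shard, which is one reason this kind of splitting is left until it is really needed.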

The accompanying diagram shows this architecture.

8. Using NoSQL and search engines

Full-text search has become an integral part of large websites; Lucene and Solr are typical tools.

NoSQL stores are more convenient for unstructured data and better suited to big data computation. Popular NoSQL databases include HBase, MongoDB, CouchDB, Redis, and Cassandra.

Different NoSQL databases use different storage models. Redis and memcache store key/value pairs. MongoDB and CouchDB store documents, with all the data of a record kept in one document. HBase and Cassandra use column-oriented storage.
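The three storage models can be compared by laying out the same record in each style. The sketch uses plain Python dicts purely to show the shape of the data; it does not use any real database client, and the record contents are made up for illustration.

```python
import json

# Key/value style: one key maps to an opaque value the store does not interpret.
key_value_store = {"user:1": '{"name": "alice", "city": "Hangzhou"}'}

# Document style: the whole record, including nested fields, is one document.
document_store = {
    "users": [
        {"_id": 1, "name": "alice", "city": "Hangzhou", "tags": ["admin"]},
    ]
}

# Column style: row key -> column family -> individual columns.
column_store = {
    "user:1": {
        "profile": {"name": "alice", "city": "Hangzhou"},
        "stats": {"logins": "17"},
    }
}

# Reading the same field from each layout.
city_from_kv = json.loads(key_value_store["user:1"])["city"]
city_from_doc = next(d for d in document_store["users"] if d["_id"] == 1)["city"]
city_from_col = column_store["user:1"]["profile"]["city"]
print(city_from_kv, city_from_doc, city_from_col)
```

In the key/value layout the application must decode the whole value to reach one field, while the document and column layouts expose individual fields to the store.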

The accompanying diagram shows this architecture.

9. Split by Business

As a site grows, it often comes to contain many complex business scenarios. A divide-and-conquer approach splits the whole site's business into product lines and the site into a number of separate applications, each deployed and maintained independently. The applications can be connected through hyperlinks, message queues, and so on, as shown in the accompanying diagram.
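Connecting applications through a message queue can be sketched as follows. The standard-library `queue.Queue` stands in for a real message broker, which would run as a separate service. The application and function names are hypothetical.

```python
import queue

order_events = queue.Queue()  # assumption: stands in for a message broker


def order_app_place_order(order_id):
    """The order application publishes an event and returns immediately."""
    order_events.put({"type": "order_placed", "order_id": order_id})
    return "order %s accepted" % order_id


def inventory_app_drain(events):
    """The inventory application consumes events at its own pace."""
    reserved = []
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            break
        if event["type"] == "order_placed":
            reserved.append(event["order_id"])
    return reserved


reply = order_app_place_order(1001)
order_app_place_order(1002)
reserved_orders = inventory_app_drain(order_events)
print(reply, reserved_orders)
```

The order application never calls the inventory application directly, so either one can be deployed or restarted without the other knowing.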

10. Distributed Services

Building on the business split above, common services such as user management and commodity management are extracted and deployed independently. These reusable business services connect to the databases and provide shared functionality, so the application systems only need to manage the user interface.

The accompanying diagram shows this architecture.

Distribution mainly addresses high concurrency, but it also introduces other problems: 1. Service calls must go over the network, which may have a significant impact on performance; 2. The more servers there are, the higher the probability of failure. One server going down may trigger a chain reaction (snowball effect) that makes many applications inaccessible and lowers site availability, so the design should guard against this; 3. Data consistency is hard to maintain in a distributed environment and distributed transactions are hard to guarantee, which may affect the correctness of the site's business and its business processes; 4. The site's dependencies become complex, making development, management, and maintenance harder.
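One common way to limit the chain reaction described in point 2 is a circuit breaker, which stops calling a failing service for a while and returns a fallback instead. The article does not name this technique; the sketch below is an illustration under that assumption, with a hypothetical `CircuitBreaker` class and a simplified reset rule.

```python
import time


class CircuitBreaker:
    """Fails fast after repeated errors so callers do not pile up on a dead service."""

    def __init__(self, max_failures, reset_after, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback  # circuit open: skip the remote call entirely
            self.opened_at = None  # reset period elapsed: try the service again
            self.failures = 0
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback
        self.failures = 0
        return result


fake_now = [0.0]  # a controllable clock for the demonstration
attempts = []


def broken_user_service():
    attempts.append(1)
    raise ConnectionError("user service down")


breaker = CircuitBreaker(max_failures=2, reset_after=30, clock=lambda: fake_now[0])
results = [breaker.call(broken_user_service, "guest") for _ in range(3)]
print(results, len(attempts))
```

The third call returns the fallback without touching the failing service, so a single outage is less likely to tie up the callers that depend on it.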
Brief summary

The main force driving website technology is always the growth of the website's business. A website evolves gradually, and responding flexibly to actual needs matters most. Technology serves the business; it is never pursued for its own sake.
