Architecture Patterns of Large-Scale Websites

Source: Internet
Author: User
Tags: failover, website, performance

The last article in this series covered the evolution of large-scale website architecture; today we discuss architecture patterns. What is a pattern? Each pattern describes a problem that occurs over and over again in our environment, together with the core of its solution, so that the solution can be reused many times without ever doing the same work twice. The key attribute of a pattern is repeatability.

The goal of website architecture patterns: facing challenges such as high concurrent access, massive data processing, and highly reliable operation, practitioners have proposed many solutions in practice, mainly in order to achieve the architectural goals of high performance, high availability, scalability, extensibility, and security.

The following are the main website architecture patterns:

1. Layering

Layering is a common architectural pattern: the system is divided along the horizontal dimension into several parts, each with a single responsibility, and the whole system works through upper layers depending on and invoking the layers below. A typical large-scale website is divided into the following three layers:

    • Application layer: responsible for concrete business logic and view presentation;

    • Service layer: provides supporting services for the application layer;

    • Data layer: provides data storage and access services.

The challenge of a layered architecture is to plan the layer boundaries and interfaces rationally.

Layered architecture constraint: cross-layer calls and reverse calls are prohibited (the data layer must not call the service layer, and the service layer must not call the application layer).
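The layering rules above can be sketched in code. This is a minimal illustration, not from the original article: the class and method names are invented for the example, and the point is only that each layer holds a reference to the layer directly below it and calls downward, never upward.

```python
# Minimal sketch of the three-layer pattern: each layer depends only on the
# layer directly beneath it, so calls flow strictly downward.

class DataLayer:
    """Data layer: provides data storage and access services."""
    def __init__(self):
        self._store = {"user:1": "Alice"}  # stand-in for a real database

    def get(self, key):
        return self._store.get(key)

class ServiceLayer:
    """Service layer: supports the application layer; calls only the data layer."""
    def __init__(self, data: DataLayer):
        self._data = data

    def get_user_name(self, user_id: int) -> str:
        return self._data.get(f"user:{user_id}") or "unknown"

class ApplicationLayer:
    """Application layer: business logic and view presentation."""
    def __init__(self, service: ServiceLayer):
        self._service = service

    def render_profile(self, user_id: int) -> str:
        return f"<h1>{self._service.get_user_name(user_id)}</h1>"

# Wire the layers top-down; only adjacent layers know about each other.
app = ApplicationLayer(ServiceLayer(DataLayer()))
print(app.render_profile(1))  # <h1>Alice</h1>
```

Because the data layer never holds a reference to the service layer (and likewise the service layer never references the application layer), the reverse calls prohibited by the constraint are impossible by construction.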

2. Segmentation

While layering cuts the system horizontally, segmentation cuts it vertically: different functions and services are separated and packaged into high-cohesion, low-coupling module units. The advantages are:

    • It makes software development and maintenance easier;

    • It facilitates distributed deployment of the different modules, improving the website's concurrent processing capacity and its ability to expand functionally.

3. Distributed

For large sites, the purpose of layering and segmentation is to enable distributed deployment: different modules are deployed on different servers and cooperate through remote calls. Distribution means more computers can work on the same task; the more computers, the more CPU, memory, and storage resources are available, and the greater the volume of concurrent access and data that can be handled. However, distribution also introduces problems:

    • Distributed services mean calls must cross the network, which can hurt performance;

    • The more servers there are, the greater the probability that one of them is down at any moment, which reduces the site's availability;

    • Keeping data consistent in a distributed environment is difficult;

    • Distributed transactions are likewise hard to guarantee;

    • Distribution increases the difficulty of development and maintenance, so never distribute merely for the sake of being distributed.

Several common distribution schemes:

    • Distributed applications and services: deploying the layered and segmented application and service modules across machines improves performance and concurrency, speeds up development and release, reduces database connection consumption, lets different applications reuse common services, and makes business expansion easier;

    • Distributed static resources: static resources such as JS, CSS, and images are deployed independently under their own domain names (so-called dynamic/static separation); this relieves load on the application servers, and the independent domain names speed up concurrent loading by the browser;

    • Distributed data and storage: large websites need to process massive amounts of data at petabyte scale; no single computer can provide that much storage, so distributed storage is required;

    • Distributed computing: websites widely use distributed computing frameworks such as Hadoop MapReduce for batch processing. Their characteristic approach is moving computation rather than moving data: the computation program is shipped to where the data resides in order to speed up processing;

    • Distributed configuration: real-time configuration updates pushed to the servers of a running website;

    • Distributed locks: concurrency control and coordination in a distributed environment;

    • Distributed files: distributed file systems that support cloud storage.
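To make the distributed-lock item above concrete, here is a toy sketch of the usual idea: a client acquires the lock by atomically setting a key with an expiry in a shared coordination store, and only the owner may release it. The "store" here is an in-process dict guarded by a mutex, standing in for a real coordination system such as Redis or ZooKeeper; all names are illustrative assumptions, not from the article.

```python
import threading
import time
import uuid

class CoordinationStore:
    """In-process stand-in for a shared coordination service (e.g. Redis)."""
    def __init__(self):
        self._mutex = threading.Lock()
        self._data = {}  # key -> (owner_token, expires_at)

    def set_if_absent(self, key, owner, ttl):
        """Atomically claim the key if it is free or its lease has expired."""
        with self._mutex:
            entry = self._data.get(key)
            now = time.monotonic()
            if entry is None or entry[1] <= now:
                self._data[key] = (owner, now + ttl)
                return True
            return False

    def delete_if_owner(self, key, owner):
        """Release only if we still own the key (avoids freeing another's lock)."""
        with self._mutex:
            if key in self._data and self._data[key][0] == owner:
                del self._data[key]

class DistributedLock:
    def __init__(self, store, key, ttl=5.0):
        self.store, self.key, self.ttl = store, key, ttl
        self.owner = str(uuid.uuid4())  # unique token per client

    def acquire(self):
        return self.store.set_if_absent(self.key, self.owner, self.ttl)

    def release(self):
        self.store.delete_if_owner(self.key, self.owner)

store = CoordinationStore()
a = DistributedLock(store, "job:rebuild-index")
b = DistributedLock(store, "job:rebuild-index")
print(a.acquire())  # True  - first client wins
print(b.acquire())  # False - lock already held by a
a.release()
print(b.acquire())  # True  - available again after release
```

The expiry (lease) matters: if the lock holder crashes, the lease runs out and other clients can proceed instead of blocking forever.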

4. Cluster

For modules under heavy concurrent user access, we should also consider clustering: multiple servers deploy the same application and form a cluster, and a load balancer distributes requests to the different servers in the cluster for processing. Clusters scale well: when traffic grows, new servers are simply added to the cluster. And because one application is served by multiple servers, when a server fails, the load balancer or the system's failover mechanism forwards its requests to the other servers in the cluster. A cluster should therefore contain at least two servers in order to provide availability.
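The routing-plus-failover behavior described above can be sketched in a few lines. This is an illustrative round-robin balancer, not a real product; the server names and the health flag are assumptions for the example.

```python
import itertools

class LoadBalancer:
    """Round-robin request distribution that skips servers marked as failed."""
    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = {s: True for s in self.servers}
        self._cycle = itertools.cycle(self.servers)

    def mark_down(self, server):
        self.healthy[server] = False  # failover: isolate the failed server

    def mark_up(self, server):
        self.healthy[server] = True   # server recovered, rejoin the cluster

    def route(self):
        """Return the next healthy server, skipping failed ones."""
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if self.healthy[server]:
                return server
        raise RuntimeError("no healthy servers in the cluster")

lb = LoadBalancer(["app-1", "app-2", "app-3"])
print([lb.route() for _ in range(3)])  # ['app-1', 'app-2', 'app-3']
lb.mark_down("app-2")
print([lb.route() for _ in range(3)])  # app-2 is skipped from now on
```

This also shows why at least two servers are needed: with a single server, `mark_down` would leave no healthy server to route to.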

5. Cache

Caching keeps data as close as possible to where it is processed in order to speed up processing. Large-scale website architecture typically applies caching in the following ways:

    • CDN (content delivery network): deployed at the network service providers closest to end users, where user requests arrive first; caching static resources there lets them be returned to the user as fast as possible;

    • Reverse proxy: part of the site's front end, deployed in front of the site; user requests reach the reverse proxy server first, and if it has the requested static resources cached it can return them directly without forwarding the request to the application servers;

    • Local cache: the application server caches hotspot data (data accessed frequently over a period of time) locally, so the application can read it directly from local memory without touching the database;

    • Distributed cache: data is cached in a dedicated distributed cache cluster, and applications fetch cached data over the network.

There are two prerequisites for using a cache:

    • Data access is unevenly distributed: some data is accessed far more frequently than the rest, and it is this hot data that belongs in the cache;

    • Data stays valid for some period of time and does not expire too quickly; otherwise the cache serves stale data, causing dirty reads that affect the correctness of results.

Benefits of caching: faster data access and reduced load on back-end applications and data storage.
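A minimal local cache with expiry ties the two prerequisites together: only requested (hot) keys get cached, and each entry carries a validity period so stale data is not served indefinitely. The loader function below is a stand-in for a real database query; all names are assumptions for the sketch.

```python
import time

class LocalCache:
    """Tiny read-through cache: entries expire after a fixed TTL."""
    def __init__(self, ttl_seconds=60.0):
        self.ttl = ttl_seconds
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key, loader):
        entry = self._entries.get(key)
        if entry and entry[1] > time.monotonic():
            return entry[0]                      # cache hit: local memory
        value = loader(key)                      # cache miss: hit the backend
        self._entries[key] = (value, time.monotonic() + self.ttl)
        return value

calls = []
def load_from_db(key):
    calls.append(key)          # count how often the "database" is queried
    return f"row-for-{key}"

cache = LocalCache(ttl_seconds=60)
cache.get("user:1", load_from_db)   # miss: goes to the database
cache.get("user:1", load_from_db)   # hit: served from memory
print(len(calls))  # 1 - the backend was queried only once
```

The load reduction mentioned above is exactly this effect: repeated reads of hot data cost one backend query per TTL window instead of one per request.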

6. Asynchronous

An important goal for a large website is to reduce software coupling. Besides the layering, segmentation, and distribution discussed above, another means of decoupling is asynchrony: instead of services calling each other synchronously, a business operation is split into stages, and the stages collaborate asynchronously by sharing data:

    • On a single server, asynchrony can be achieved through a shared in-memory queue between threads: the front-stage business thread writes data into the queue, and subsequent threads read from the queue and process it;

    • In a distributed system, server clusters collaborate asynchronously through a distributed message queue, which can be regarded as a distributed deployment of the in-memory queue.
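The single-server case above maps directly onto the standard library. Here is a minimal producer-consumer sketch with one shared in-memory queue; the sentinel value and the doubling "business logic" are illustrative assumptions.

```python
import queue
import threading

work_queue = queue.Queue()   # the shared in-memory queue between threads
results = []

def consumer():
    """Subsequent-stage thread: reads from the queue and processes items."""
    while True:
        item = work_queue.get()
        if item is None:          # sentinel: no more work, stop
            break
        results.append(item * 2)  # stand-in for real business processing
        work_queue.task_done()

worker = threading.Thread(target=consumer)
worker.start()

# Front-stage "producer": enqueue work and return immediately,
# without waiting for the processing result.
for i in range(5):
    work_queue.put(i)

work_queue.put(None)   # signal shutdown
worker.join()
print(results)  # [0, 2, 4, 6, 8]
```

A distributed message queue plays the same role between servers that `queue.Queue` plays between threads here, which is why the article calls it a distributed deployment of the in-memory queue.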

The asynchronous pattern is the classic producer-consumer pattern, and an asynchronous message queue has the following characteristics:

    • Improved system availability: if the consumer server goes down, data accumulates in the message queue while the producer server continues handling business requests, so the system as a whole keeps running; once the consumer server recovers, it continues processing the backlog in the queue;

    • Faster website response: a producer server at the front of the business process can write its data to the message queue and return immediately, without waiting for a result, reducing response latency;

    • Smoothed traffic spikes: bursts of peak request traffic are placed in the message queue to be processed by consumers in turn, so sudden spikes do not put excessive pressure on the site.

7. Redundancy

A website must serve users 24 hours a day. To ensure that a server crash neither interrupts the site nor loses data, a certain degree of redundant server operation and redundant data backup is required, so that when a server goes down, its services and data access can be transferred to other redundant servers.

Beyond regular backup and archival storage of the database (cold backup), ensuring high availability of online services also requires master-slave separation of the database, with real-time synchronization implementing hot backup.

To withstand natural disasters and other non-human events, it is generally necessary to back up the entire data center: deploy disaster-recovery data centers globally, and synchronize the website's programs and data to several of them in real time.

8. Automation

Automation mainly covers automated code management, automated testing, automated security scanning, and automated deployment, making the release process automatic. It also requires:

    • Automated server monitoring and alerting;

    • Automated failover: isolating a failed server from the cluster;

    • Automated failback: restarting the service and synchronizing data to ensure consistency after recovery;

    • Automated demotion: bringing system load down to a safe level by rejecting some requests and shutting down unimportant services;

    • Automated resource allocation: assigning idle resources to critical services and scaling out their deployment.
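Automated demotion can be reduced to a very small decision rule: above a load threshold, keep only the critical services enabled. The sketch below illustrates that rule; the threshold, load values, and service names are all assumptions invented for the example.

```python
# Illustrative automated-demotion rule: when system load crosses a threshold,
# non-critical services are switched off so core services keep running.

CRITICAL = {"checkout", "login"}   # services that must never be demoted

def services_to_keep(all_services, load, threshold=0.8):
    """Below the threshold keep everything; above it, keep only critical services."""
    if load < threshold:
        return set(all_services)
    return {s for s in all_services if s in CRITICAL}

services = ["checkout", "login", "recommendations", "comments"]
print(sorted(services_to_keep(services, load=0.5)))   # normal load: all enabled
print(sorted(services_to_keep(services, load=0.95)))  # ['checkout', 'login']
```

A real system would drive `load` from monitoring metrics and toggle services via feature flags, but the decision itself is this simple comparison.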

9. Security

Security is mainly considered from the following aspects:

    • Identity authentication via passwords and phone verification codes;

    • Encrypted network communication for login, transactions, and other sensitive operations;

    • CAPTCHAs to identify bots and prevent them from abusing network resources to attack the site;

    • Encoding and escaping to defend against common XSS attacks and SQL injection;

    • Filtering of spam and sensitive information;

    • Risk control for important operations such as transfers and transactions, based on transaction patterns and transaction data.

10. Summary

Layering, segmentation, distribution, clustering, caching, asynchrony, redundancy, automation, and security are the core architecture patterns of large-scale websites; applied appropriately, they together realize the goals of high performance, high availability, scalability, extensibility, and security.
