Large Web site architecture technology at a glance

Source: Internet
Author: User
Tags website performance

A large website's system architecture can be divided into the following layers: front-end architecture, application layer architecture, service layer architecture, storage layer architecture, back-end architecture, data collection and monitoring, security architecture, and data center architecture.

1. Front-end Architecture (browser optimization, CDN, static/dynamic separation, independent deployment of static resources, image service, reverse proxy, DNS)
The front end is the part of the path a user request travels before it reaches the site's application servers. It usually contains no business logic and does not process dynamic content.

Browser optimization
This does not mean optimizing the browser itself. It means optimizing the response page so that the browser loads and displays it faster. Common techniques are page caching, merging HTTP requests to reduce their number, and page compression.
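
As a rough illustration of page compression and page caching, the sketch below gzip-compresses a response body when the client advertises gzip support and attaches a cache header. The function name `compress_response` and the 300-second cache lifetime are assumptions made for this example, not values from the article.

```python
import gzip


def compress_response(body: str, accept_encoding: str):
    """Return (payload, headers) for a page, gzip-compressed if the client allows it."""
    raw = body.encode("utf-8")
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        # Assumed value: let browsers and caches reuse the page for 5 minutes.
        "Cache-Control": "public, max-age=300",
    }
    if "gzip" in accept_encoding.lower():
        raw = gzip.compress(raw)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(raw))
    return raw, headers


# A repetitive HTML page compresses well.
page = "<html><body>" + "<p>hello</p>" * 200 + "</body></html>"
payload, headers = compress_response(page, "gzip, deflate")
print(len(page.encode("utf-8")), "bytes before,", len(payload), "bytes after")
```

In a real deployment this work is normally done by the web server or reverse proxy rather than by application code.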

CDN
A content delivery network (CDN) is deployed in network operators' data centers. Static page content is distributed to the CDN server nearest to the user, so that users obtain content over the shortest path.

Static/dynamic separation and independent deployment of static resources
Static resources such as JS and CSS files are deployed on a dedicated server cluster, separate from the web application's dynamic content service, and are served from a dedicated (second-level) domain name.

Image service
"Images" here does not mean the website logo, button icons, and the like. Those files are static resources as described above and should be deployed together with JS and CSS. It means user-uploaded images such as product pictures and user avatars. The image service is likewise deployed on an independent image server cluster and served from an independent (second-level) domain name.

Reverse proxy
Deployed in the website's own data center, in front of the application servers, static resource servers, and image servers, to provide page caching.

DNS
The Domain Name System resolves a domain name to an IP address. DNS can also be used for load balancing, and setting up a CDN requires changing DNS records so that the domain name resolves to the CDN servers.

2. Application Layer Architecture (development framework, page rendering, load balancing, session management, static generation of dynamic pages, business splitting, virtualized servers)
The application layer is where the main business logic of the site is handled.

Development framework
Website business requirements change constantly, and most website software engineers spend long hours developing business features, so a good development framework is crucial. A good framework should enforce separation of concerns, so that designers and developers can each focus on their own work and collaborate easily. It should also have some security policies built in to protect against web attacks.

Page rendering
Dynamic content and static page templates, which are developed and maintained separately, are combined into the complete page that is finally displayed to the user.

Load balancing
Multiple application servers are organized into a cluster, and load balancing distributes user requests across them to cope with the high load that occurs when large numbers of users access the site concurrently.
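
One simple way to distribute requests is round-robin, where each new request goes to the next server in the list. The sketch below illustrates that one policy only. The class name and the server names are hypothetical, and production load balancers usually add health checks and weighting.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers in a fixed rotating order."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        # Each call returns the next server, wrapping around at the end.
        return next(self._cycle)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assigned = [balancer.pick() for _ in range(6)]
print(assigned)
```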

Session management
To make the application server cluster highly available, application servers are generally designed to be stateless and do not store the context of user requests. The website's business, however, usually needs to keep user session information, so a dedicated mechanism is required to manage sessions in a way that lets application servers within a cluster, or even across clusters, share them.
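
A minimal sketch of the idea, assuming the session data lives in an external store that every application server can reach. Here an in-process dictionary stands in for that store, and `SharedSessionStore` and `AppServer` are hypothetical names introduced for illustration.

```python
import secrets
import time


class SharedSessionStore:
    """Stand-in for an external session service shared by all app servers."""

    def __init__(self, ttl_seconds=1800):
        self._ttl = ttl_seconds
        self._data = {}  # session_id -> (expires_at, attributes)

    def create(self, attributes, now=None):
        now = time.time() if now is None else now
        session_id = secrets.token_hex(16)
        self._data[session_id] = (now + self._ttl, dict(attributes))
        return session_id

    def get(self, session_id, now=None):
        now = time.time() if now is None else now
        entry = self._data.get(session_id)
        if entry is None or entry[0] <= now:
            # Unknown or expired session: drop it and report nothing.
            self._data.pop(session_id, None)
            return None
        return entry[1]


class AppServer:
    """Stateless server: it keeps no session data of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, user):
        return self.store.create({"user": user})

    def whoami(self, session_id):
        session = self.store.get(session_id)
        return session["user"] if session else None


store = SharedSessionStore()
server_a, server_b = AppServer("a", store), AppServer("b", store)
sid = server_a.login("alice")
print(server_b.whoami(sid))  # the session created on server a is visible on server b
```

Because neither server holds session state, either one can fail or be replaced without logging users out.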

Static generation of dynamic pages
Dynamic pages that receive very heavy traffic but are updated infrequently can be pre-generated as static pages. User access is then accelerated with the usual static page optimizations, such as reverse proxies, CDNs, and browser caches.

Business splitting
A large, complex business is split into a number of smaller products that are developed, deployed, and maintained independently. Besides reducing system coupling, this also makes it easier to split the database by business. Splitting a relational database along business lines is technically fairly easy and gives relatively good results.

Virtualized servers
A physical server is virtualized into multiple virtual servers. For businesses with low concurrent traffic, this makes it easier to build a highly available application server cluster with fewer resources.

3. Service Layer Architecture (distributed messaging, distributed services, distributed cache, distributed configuration)
Provides basic services for the application layer to call in order to carry out the website's business.

Distributed messaging
A message queue is used to send messages asynchronously and to keep the coupling low between one business and another, and between businesses and services.
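
The following single-process sketch shows the decoupling a message queue provides: the order code only publishes an event and returns, while a separate worker consumes it later. Python's standard `queue.Queue` stands in for a distributed message broker, and the order and email scenario is an assumption made for the example.

```python
import queue
import threading

order_events = queue.Queue()
sent_emails = []


def email_worker():
    """Consumer: handles events without the producer waiting for it."""
    while True:
        event = order_events.get()
        if event is None:  # sentinel value: stop the worker
            order_events.task_done()
            break
        sent_emails.append(f"confirmation for order {event['order_id']}")
        order_events.task_done()


def place_order(order_id):
    """Producer: publishes an event and returns immediately."""
    order_events.put({"order_id": order_id})
    return "accepted"


worker = threading.Thread(target=email_worker, daemon=True)
worker.start()
results = [place_order(order_id) for order_id in (101, 102)]
order_events.put(None)
order_events.join()  # wait until every queued event has been processed
worker.join(timeout=1)
print(results, sent_emails)
```

The producer knows nothing about email delivery, so the consumer can be changed, scaled, or taken offline briefly without touching the ordering code.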

Distributed services
Provides high-performance, loosely coupled, easily reused, and easily managed distributed services, implementing a service-oriented architecture (SOA) for the website.

Distributed cache
A scalable server cluster provides caching for large volumes of hot data. This is an important means of optimizing website performance.
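
The article does not say how keys are assigned to cache servers. One commonly used technique, shown here as an assumption rather than as the article's method, is consistent hashing, which keeps most keys on the same server when the cluster grows. The class name, server names, and replica count below are hypothetical.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Maps keys to cache servers using a hash ring with virtual nodes."""

    def __init__(self, servers, replicas=100):
        self._replicas = replicas
        self._ring = []  # sorted list of (hash, server)
        for server in servers:
            self.add(server)

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def add(self, server):
        # Each server is placed on the ring many times to even out the load.
        for i in range(self._replicas):
            bisect.insort(self._ring, (self._hash(f"{server}#{i}"), server))

    def server_for(self, key):
        # Walk clockwise to the first virtual node at or after the key's hash.
        index = bisect.bisect(self._ring, (self._hash(key), "")) % len(self._ring)
        return self._ring[index][1]


ring = ConsistentHashRing(["cache-1", "cache-2", "cache-3"])
keys = [f"user:{i}" for i in range(1000)]
before = {key: ring.server_for(key) for key in keys}
ring.add("cache-4")
after = {key: ring.server_for(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
print(len(moved), "of", len(keys), "keys moved to the new server")
```

With plain `hash(key) % server_count` routing, adding a server would remap most keys and invalidate most of the cache. On the ring, only the keys claimed by the new server move.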

Distributed configuration
A system has many configuration parameters. Normally, changing one of them means editing configuration and restarting: for example, when a new cache server joins the distributed cache cluster, each application client must update its cache server list and the application server must be restarted. Distributed configuration provides a push service at run time that delivers configuration changes to the application without restarting the server.
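
The push behavior can be sketched with a simple publish/subscribe object. Here `ConfigCenter`, `CacheClient`, and the key `cache.servers` are hypothetical names, and an in-process callback stands in for the network push a real configuration service would perform.

```python
class ConfigCenter:
    """Holds configuration values and pushes changes to subscribers."""

    def __init__(self):
        self._values = {}
        self._subscribers = {}  # key -> list of callbacks

    def subscribe(self, key, callback):
        self._subscribers.setdefault(key, []).append(callback)
        if key in self._values:
            callback(self._values[key])  # deliver the current value at once

    def publish(self, key, value):
        self._values[key] = value
        for callback in self._subscribers.get(key, []):
            callback(value)


class CacheClient:
    """Application-side client whose server list is updated at run time."""

    def __init__(self, config_center):
        self.servers = []
        config_center.subscribe("cache.servers", self._on_servers_changed)

    def _on_servers_changed(self, servers):
        self.servers = list(servers)


center = ConfigCenter()
center.publish("cache.servers", ["cache-1", "cache-2"])
client = CacheClient(center)
center.publish("cache.servers", ["cache-1", "cache-2", "cache-3"])
print(client.servers)  # the new server is visible without a restart
```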

4. Storage layer Architecture (distributed files, relational databases, NoSQL databases, data synchronization)
Provides persistent storage access and management services for data and files.

Distributed files
Most of the files a website's online business needs to store are pictures, web pages, videos, and other relatively small files. Their number is very large and usually keeps growing, so a distributed file system with good scalability is needed.

Relational database
Most core business is built on relational databases, but relational databases have poor support for cluster scalability. By adding routing to the application's data access layer, database accesses are directed to different physical databases according to business configuration, which gives distributed access to relational databases.
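
A minimal sketch of routing in the data access layer, assuming a configuration that maps each business to a physical database. The business and database names are hypothetical, and separate in-memory SQLite databases stand in for separate database servers.

```python
import sqlite3


class DatabaseRouter:
    """Chooses a physical database for each business from a routing table."""

    def __init__(self, routing_config):
        self._routing = dict(routing_config)  # business -> database name
        self._connections = {}

    def connection_for(self, business):
        db_name = self._routing[business]
        if db_name not in self._connections:
            # Each ":memory:" connection is an independent database.
            self._connections[db_name] = sqlite3.connect(":memory:")
        return self._connections[db_name]


router = DatabaseRouter({
    "users": "db_user",
    "orders": "db_trade",
    "payments": "db_trade",  # two businesses may share one database
})

users_db = router.connection_for("users")
users_db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
users_db.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

orders_db = router.connection_for("orders")
orders_db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)")
```

Business code asks the router for a connection by business name, so moving a business to another database only requires a configuration change.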

NoSQL Database
The various NoSQL databases each have their own strengths in memory management, data model, distributed cluster management, and other areas. Judged by community activity, the author considers HBase the strongest at the time of writing.

Data synchronization
Until distributed database technologies that support global data sharing mature, a site with multiple data centers must synchronize data between them so that each data center holds a complete copy. In practice, to reduce pressure on the database, the database transaction log (or the NoSQL write operation log) is shipped to the other data centers and replayed there to bring the data into sync.
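
Log replay can be sketched as follows: the primary records every write in an ordered log, and a replica applies the entries it has not yet seen. `KeyValueStore` and `Replica` are hypothetical stand-ins for the databases in two data centers, and the log format is an assumption for this example.

```python
class KeyValueStore:
    """Primary store that records every write in an ordered log."""

    def __init__(self):
        self.data = {}
        self.log = []  # entries of (sequence, operation, key, value)

    def put(self, key, value):
        self.data[key] = value
        self.log.append((len(self.log) + 1, "put", key, value))

    def delete(self, key):
        self.data.pop(key, None)
        self.log.append((len(self.log) + 1, "delete", key, None))


class Replica:
    """Store in another data center, rebuilt by replaying the primary's log."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # sequence number of the last applied entry

    def replay(self, log):
        for sequence, operation, key, value in log:
            if sequence <= self.applied:
                continue  # already applied, so replaying twice is safe
            if operation == "put":
                self.data[key] = value
            elif operation == "delete":
                self.data.pop(key, None)
            self.applied = sequence


primary = KeyValueStore()
primary.put("sku:1", 10)
primary.put("sku:2", 5)
primary.delete("sku:1")

replica = Replica()
replica.replay(primary.log)
print(replica.data == primary.data)
```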

5. Back-end Architecture (search engine, data warehouse, recommendation system)
Besides handling users' real-time requests, a website also has non-real-time, background data analysis to perform.

Search engine
Even a website's internal search engine needs incremental and full data updates, index building, and so on. These jobs are run periodically by the back-end system.

Data Warehouse
Provides data analysis and data mining services based on offline data.

Recommendation system
Social networking sites and shopping sites provide personalized recommendations by mining the relationships among people, and between people and goods, to discover potential connections and shopping interests.

6. Data Collection and Monitoring (browser data collection, server business data collection, server performance data collection, system monitoring, system alarms)
Monitors website traffic and system operation to support business decisions and operations management.

Browser data collection
JS scripts embedded in the site's pages collect the user's browser environment and operation records, which are then used to analyze user behavior.

Server business data collection
Server business data is of two kinds. One is the log of user request operations recorded on the server side. The other is business data gathered while the application runs, such as the number of messages waiting to be processed.

Server performance data collection
Collects server performance data such as system load, memory usage, and network card traffic.

System monitoring
The data collected above is displayed graphically so that operations staff and business operators can watch the health of the site. This step is monitoring only. A more advanced approach is to automate operations based on the collected data, handling system anomalies automatically and achieving automated control.

System alarms
If collected data exceeds a preset normal threshold, for example when system load is too high, an alarm is sent by email, text message, voice call, or similar means, and an engineer is expected to intervene.
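
The threshold check itself is simple. The sketch below compares collected metrics with preset limits and returns the alarm messages to send. The metric names and limits are assumptions, and delivery by email, text message, or voice call is left out.

```python
def check_thresholds(metrics, thresholds):
    """Return one alarm message for each metric above its threshold."""
    alarms = []
    for name, limit in thresholds.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            alarms.append(f"{name}={value} exceeds threshold {limit}")
    return alarms


# Hypothetical thresholds and one collected sample.
thresholds = {"system_load": 8.0, "memory_usage": 0.9}
sample = {"system_load": 9.5, "memory_usage": 0.6}
alarms = check_thresholds(sample, thresholds)
print(alarms)
```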

7. Security Architecture (web attacks, data protection)
Protects the website from attacks and from leaks of sensitive information.

Web attacks
These are attacks launched in the form of HTTP requests. The most harmful are XSS and SQL injection. With appropriate measures, however, both are relatively easy to guard against.
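
Two widely used countermeasures, given here as examples rather than as the article's prescribed measures, are parameterized queries against SQL injection and output escaping against XSS. The table, function names, and sample data below are hypothetical.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")


def find_user(conn, name):
    # The "?" placeholder passes the value as data, never as SQL text.
    query = "SELECT name, email FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


def render_comment(text):
    # Escaping turns markup characters into harmless HTML entities.
    return "<p>" + html.escape(text) + "</p>"


print(find_user(conn, "' OR '1'='1"))  # the injection attempt matches no rows
print(render_comment("<script>alert(1)</script>"))
```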

Data protection
Sensitive information is encrypted both in transit and in storage to protect the assets of the website and its users.

8. Data Center Architecture (machine room architecture, cabinet architecture, server architecture)
A large website may run on the order of 100,000 servers, so the physical architecture of the data center deserves attention.

Machine room architecture
Take a large website with 100,000 servers. If each server costs about 2,000 yuan a year in electricity (counting both the server's own power and the air conditioning), the site's annual electricity bill comes to 100,000 × 2,000 = 200 million yuan. Data center energy consumption is an increasingly serious problem, and when choosing data center locations Google and Facebook tend to favor sites with good cooling conditions and ample power supply.

Cabinet architecture
This covers a series of questions including cabinet size, network cable layout, indicator light specifications, uninterruptible power supply, and voltage specification (48 V DC or 220 V mains AC).

Server architecture
Because large websites buy servers in bulk, most use custom-built servers instead of off-the-shelf complete machines. The hard drives, memory, and even the CPU are customized to the site's application needs, unnecessary peripheral interfaces (display output, mouse and keyboard input) are removed, and the internal layout is arranged to aid cooling.
