[System architecture] Technical Summary of large-scale distributed website architecture and system architecture

Source: Internet
Author: User

This article is a technical summary of what I have learned about large-scale distributed website architecture. It describes the architecture of a high-performance, highly available, scalable, and extensible distributed website and offers a reference architecture. Some parts are reading notes and some are summaries of personal experience. It should be a useful reference for anyone working on large-scale distributed website architectures. (If you find it helpful, please recommend it. Thank you. This blog will gradually publish a series of articles on the architecture, design patterns, and architecture patterns of large-scale distributed websites. Contact group: 466097527)

The outline of this article is as follows:

I. Features of large websites

  • Many users, widely distributed
  • High traffic and high concurrency
  • Massive data and high service availability requirements
  • Hostile security environment, vulnerable to network attacks
  • Rapidly changing requirements and frequent releases
  • Incremental growth from small to large
  • User-centric
  • Free service, paid experience

II. Architecture objectives of large websites

  • High performance: provide a fast access experience.
  • High availability: the website service is always accessible.
  • Scalability: add or remove hardware to raise or lower processing capacity.
  • Security: provide policies for secure website access, data encryption, and secure storage.
  • Extensibility: functions or modules can be added or removed easily.
  • Agility: respond quickly as needs change.

III. Architecture patterns of large websites

 

  • Layering: generally divided into the application layer, service layer, data layer, management layer, and analysis layer.
  • Segmentation: generally split by business, module, or function; for example, the application layer is split into the homepage, the user center, and so on.
  • Distribution: deploy applications separately (for example, on multiple physical machines) and have them cooperate through remote calls.
  • Clustering: deploy multiple copies of an application, module, or function (for example, on multiple hosts) and expose them through a load balancer.
  • Caching: place data as close as possible to the application or user to speed up access.
  • Asynchronization: turn synchronous operations into asynchronous ones. The client sends a request without waiting for the server to finish; once processing completes, the requester learns the result through a notification or by polling. This is generally the request-response-notification pattern.
  • Redundancy: add copies to improve availability, security, and performance.
  • Security: provide effective solutions for known problems, and establish detection and defense mechanisms for unknown or potential problems.
  • Automation: let machines handle repetitive tasks without human intervention.
  • Agility: embrace changing requirements and respond quickly to business development needs.
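
The request-response-notification pattern from the list above can be sketched as follows. This is only a minimal illustration using Python's standard `concurrent.futures`; `slow_task`, `submit_order`, and the `notifications` list are hypothetical names that stand in for real server-side work and a real notification channel.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)
notifications = []          # stands in for a real notification channel
done = threading.Event()    # lets the demo wait for the notification


def slow_task(order_id):
    """Hypothetical long-running server-side work."""
    return f"order {order_id} processed"


def notify(future):
    """Callback: runs once the work is finished and notifies the requester."""
    notifications.append(future.result())
    done.set()


def submit_order(order_id):
    """Accept the request and respond immediately, without waiting for the work."""
    future = executor.submit(slow_task, order_id)
    future.add_done_callback(notify)
    return "accepted"


response = submit_order(42)   # returns right away
done.wait(timeout=5)          # a real client would do other work, or poll
print(response, notifications)
executor.shutdown(wait=True)
```

The caller gets `"accepted"` before the work is done; the result arrives later through the callback, which is the part a production system would replace with a message, webhook, or polling endpoint.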

IV. High-performance architecture

High-performance architecture is user-centric: it aims to provide a fast web access experience. The key metrics are short response time, high concurrent processing capacity, high throughput, and stable performance.

Optimization can be divided into front-end optimization, application-layer optimization, code-layer optimization, and storage-layer optimization.

Front-end optimization: the part that comes before the website's business logic.

Browser optimization: reduce the number of HTTP requests, use the browser cache, enable compression, position CSS and JS appropriately, load JS asynchronously, and reduce cookie transmission.

CDN acceleration and reverse proxy.

Application-layer optimization: the servers that handle the website's business logic. Use caching, asynchronous processing, and clustering.

Code optimization: sound architecture, multithreading, resource reuse (object pools, thread pools, etc.), good data structures, JVM tuning, singletons, caching, etc.
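
Resource reuse through an object pool, as mentioned above, might look like the sketch below. It is an illustration only: `ObjectPool` and `make_connection` are hypothetical names, and the "connection" is a plain dictionary standing in for an expensive resource such as a database connection.

```python
import queue
from contextlib import contextmanager


class ObjectPool:
    """Fixed-size pool: objects are created once and then reused."""

    def __init__(self, factory, size):
        self._items = queue.Queue(maxsize=size)
        for _ in range(size):
            self._items.put(factory())

    @contextmanager
    def borrow(self, timeout=None):
        # Blocks while the pool is empty; raises queue.Empty after `timeout` seconds.
        item = self._items.get(timeout=timeout)
        try:
            yield item
        finally:
            self._items.put(item)   # always return the object to the pool


created = []


def make_connection():
    """Hypothetical expensive constructor; records every object it creates."""
    conn = {"id": len(created)}
    created.append(conn)
    return conn


pool = ObjectPool(make_connection, size=2)
with pool.borrow() as a:
    with pool.borrow() as b:
        in_use = (a["id"], b["id"])
with pool.borrow() as c:
    reused = c                      # an existing object, not a new one

print(len(created), in_use, reused)
```

However many times `borrow()` is called, only two objects are ever constructed, which is the point of pooling.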

Storage optimization: caching, solid-state drives, fiber-optic transmission, optimized reads and writes, disk redundancy, distributed storage (HDFS), NoSQL, etc.
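
Caching shows up at several of the layers above. One common way to apply it is the cache-aside approach with an expiry time, sketched here under stated assumptions: `TTLCache` and `load_user_from_db` are hypothetical names, and an in-process dictionary stands in for a distributed cache.

```python
import time


class TTLCache:
    """Cache-aside helper: return a fresh cached value, otherwise load and store it."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}            # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]         # cache hit
        value = loader(key)         # cache miss or expired: go to the slow source
        self._store[key] = (now + self._ttl, value)
        return value


db_calls = []


def load_user_from_db(user_id):
    """Hypothetical slow database lookup; records each call."""
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


cache = TTLCache(ttl_seconds=60)
first = cache.get_or_load(7, load_user_from_db)
second = cache.get_or_load(7, load_user_from_db)   # served from the cache
print(first == second, db_calls)
```

The expiry time bounds how stale a cached value can become; choosing it is a trade-off between load on the storage layer and data freshness.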

V. High-availability architecture

Large websites should be accessible at any time and provide service normally. Because large websites are complex, distributed, and built on low-cost servers, open-source databases, and open-source operating systems, guaranteeing high availability is very difficult; in other words, failures are inevitable.

How to improve availability is therefore a pressing problem. First, availability must be considered at the architecture level during planning. In the industry, availability is usually expressed as a number of nines. For example, four nines (99.99%) allows about 53 minutes of downtime per year.
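
The downtime figure follows directly from the percentage. A small sketch of the arithmetic, assuming a 365-day year (the function name is only illustrative):

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent):
    """Minutes per year a service may be down at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% -> {minutes:.2f} minutes per year")
```

For four nines this gives 525,600 × 0.0001 = 52.56 minutes, which rounds to the 53 minutes quoted above. Each additional nine cuts the allowance by a factor of ten.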

Different layers use different strategies. Redundant backup and failover are the general means of solving high-availability problems.

Application layer: generally designed to be stateless, so it makes no difference which server handles a given request. High availability is usually achieved with load balancing (session synchronization must be solved).

Service layer: load balancing, tiered management, fail-fast (timeout settings), asynchronous calls, service degradation, idempotent design, and so on.
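
Two of the service-layer measures above, failing fast on a timeout and degrading the service, can be sketched together. This is a minimal illustration rather than a production circuit breaker; `call_with_fallback`, the recommendation functions, and `DEFAULTS` are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

executor = ThreadPoolExecutor(max_workers=4)


def call_with_fallback(func, timeout_s, fallback):
    """Run func, but give up after timeout_s seconds and return a degraded result."""
    future = executor.submit(func)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # Fail fast: stop waiting. The worker thread still finishes in the background.
        return fallback
    except Exception:
        # The dependency raised an error: degrade instead of propagating the failure.
        return fallback


def slow_recommendations():
    """Hypothetical dependency that is currently too slow."""
    time.sleep(0.5)
    return ["personalized-1", "personalized-2"]


def fast_recommendations():
    """Hypothetical dependency that answers in time."""
    return ["personalized-1"]


DEFAULTS = ["bestseller-1", "bestseller-2"]   # degraded, non-personalized answer

degraded = call_with_fallback(slow_recommendations, 0.05, DEFAULTS)
normal = call_with_fallback(fast_recommendations, 1.0, DEFAULTS)
print(degraded, normal)
executor.shutdown(wait=True)
```

The caller's latency is bounded by the timeout no matter how slow the dependency becomes, at the cost of sometimes returning a less useful answer.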

Data layer: redundant backup (cold standby, hot standby [synchronous or asynchronous], warm standby) and failover (failure confirmation, access transfer, data recovery). The well-known theoretical basis for data high availability is the CAP theorem (consistency, availability, partition tolerance); data consistency is further divided into strong consistency, user-perceived consistency, and eventual consistency.

VI. Scalable architecture

Scalability means increasing or decreasing the system's processing capacity by adding or removing hardware (servers), without changing the original architecture design.

Application layer: split applications vertically or horizontally, then load-balance each individual function (via DNS, HTTP [reverse proxy], IP, or the link layer).

Service layer: similar to the application layer.

Data layer: database sharding, table sharding, and NoSQL; common algorithms are hashing and consistent hashing.
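
Consistent hashing is worth a short sketch, because its benefit over plain `hash(key) % N` only shows when a node is added or removed. The code below is one possible implementation with virtual nodes, not the one used by any particular database; `ConsistentHashRing`, the `db-*` node names, and the choice of SHA-256 and 100 virtual nodes are assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    def __init__(self, nodes=(), replicas=100):
        self._replicas = replicas   # virtual nodes per physical node
        self._ring = []             # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self._replicas):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def get_node(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First virtual node at or after the key's position, wrapping around.
        idx = bisect.bisect_right(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = ConsistentHashRing(["db-0", "db-1", "db-2"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}

ring.add_node("db-3")
after = {k: ring.get_node(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved to the new node")
```

Only the keys that now belong to `db-3` change owner; every other key stays where it was. With modulo hashing, going from three to four shards would instead remap most keys.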

VII. Extensible architecture

Extensibility means function modules can be added or removed easily, with good extensibility at the code and module level.

Modularization and componentization: high cohesion and low coupling improve reusability and extensibility.

Stable interfaces: define stable interfaces; as long as an interface remains unchanged, the internal implementation can change freely.

Design patterns: apply object-oriented ideas and principles, and use design patterns for code-level design.

Message queues: modules interact through message queues, which decouples the dependencies between them.
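
The decoupling effect of a message queue can be shown with a small in-process sketch. Python's `queue.Queue` stands in here for a real message broker, and `order_module` and `email_worker` are hypothetical modules invented for the example.

```python
import queue
import threading

order_events = queue.Queue()   # stands in for a real message broker
emails_sent = []


def order_module(order_id):
    """Producer: publishes an event and knows nothing about its consumers."""
    order_events.put({"type": "order_created", "order_id": order_id})


def email_worker():
    """Consumer: reacts to events; can be changed or replaced independently."""
    while True:
        event = order_events.get()
        try:
            if event is None:   # sentinel value: stop the worker
                break
            emails_sent.append(f"confirmation for order {event['order_id']}")
        finally:
            order_events.task_done()


worker = threading.Thread(target=email_worker)
worker.start()

for order_id in (1, 2, 3):
    order_module(order_id)

order_events.put(None)   # ask the worker to stop
order_events.join()      # wait until every event has been handled
worker.join()
print(emails_sent)
```

The order module depends only on the event format, not on the email module, so a new consumer (for example, an inventory module) could be added without touching the producer.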

Distributed services: common modules are turned into services used by other systems, improving reusability and extensibility.

VIII. Security architecture

Security architecture provides effective solutions for known problems and establishes detection and defense mechanisms for unknown or potential ones. Security starts with awareness: establish effective security mechanisms and enforce them at the policy and organizational level. For example, server passwords must not be disclosed, passwords are changed every month and may not repeat any of the previous three, and a security scan runs every week. Institutionalizing such rules strengthens the security system. Security cannot be ignored at any level: it covers infrastructure security, application system security, and data security.

Infrastructure security: covers hardware procurement, the operating system, and the network environment. Generally, purchase quality products through legitimate channels, choose a secure operating system, patch vulnerabilities promptly, and install antivirus software and a firewall to guard against viruses and backdoors. Configure firewall policies, build DDoS defenses, use an attack detection system, and isolate subnets.

Application system security: during development, use the correct techniques to address common problems in advance. Guard against cross-site scripting (XSS), injection attacks, cross-site request forgery (CSRF), information leakage through error messages and HTML comments, malicious file uploads, and path traversal. A web application firewall (such as ModSecurity), security vulnerability scanning, and similar measures further strengthen application-level security.
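
As an example of "the correct technique" for two of the attacks listed above, the sketch below uses a parameterized query against SQL injection and output escaping against XSS. It relies only on Python's standard `sqlite3` and `html` modules; the `users` table, `find_user`, and `render_comment` are hypothetical.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "member")],
)


def find_user(name):
    """The ? placeholder passes the value as data, never as SQL text."""
    cur = conn.execute("SELECT name, role FROM users WHERE name = ?", (name,))
    return cur.fetchall()


def render_comment(text):
    """Escape user input before embedding it in HTML."""
    return f"<p>{html.escape(text)}</p>"


malicious_name = "x' OR '1'='1"
rows_attack = find_user(malicious_name)   # treated as a literal (odd) name
rows_normal = find_user("alice")
rendered = render_comment("<script>alert(1)</script>")
print(rows_attack, rows_normal, rendered)
```

Had the query been built by string concatenation, the malicious name would have turned the `WHERE` clause into a condition that matches every row. With the placeholder it matches nothing.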

Data confidentiality and security: storage security (store data on reliable devices, with real-time and scheduled backups), custody security (encrypt important information at rest and entrust its safekeeping and auditing to suitable personnel), and transmission security (prevent data theft and tampering).

Common encryption algorithms: one-way hashing (MD5, SHA), symmetric encryption (DES, 3DES, RC), and asymmetric encryption (RSA).
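
One-way hashing is typically what protects stored passwords. The sketch below is an assumption-laden illustration, not a recommendation from the original article: it uses a salted, iterated hash (PBKDF2 with SHA-256 from Python's standard `hashlib`) because a single unsalted MD5 or SHA pass is generally considered too fast to resist guessing attacks. The iteration count and function names are illustrative.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000   # assumed work factor; tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-derive the digest and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

Only the salt and the digest are stored; the password itself cannot be recovered from them, which is what "one-way" means here.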

IX. Agility

Website architecture design and operations management must adapt to change and provide high scalability and extensibility, so that the site can readily cope with rapid business growth, sudden traffic surges, and similar demands.

In addition to the architecture elements described above, agile management and agile development also need to be introduced, so that business, product, technology, and operations work as one and respond quickly as needs change.

X. Example of a large website architecture

 

The example uses a seven-layer logical architecture: layer 1 is the client layer, layer 2 the front-end optimization layer, layer 3 the application layer, layer 4 the service layer, layer 5 the data storage layer, layer 6 the big data storage layer, and layer 7 the big data processing layer.

Client layer: supports PC browsers and mobile apps. The difference is that a mobile app can access the reverse proxy server directly by IP address.

Front-end layer: uses DNS load balancing, local CDN acceleration, and reverse proxy services.

Application layer: the website application cluster, split vertically by business, for example into the product application and the member center.

Service layer: provides common services, such as the user service, order service, and payment service.

Data layer: supports relational database clusters (with read/write splitting), NoSQL clusters, distributed file system clusters, and a distributed cache.

Big data storage layer: collects log data from the application and service layers, and structured and semi-structured data from the relational and NoSQL databases.

Big data processing layer: uses MapReduce for offline data analysis or Storm for real-time data analysis, and saves the processed data to relational databases. (In practice, offline and real-time data are classified and processed according to business requirements and stored in different databases for use by the application layer or service layer.)
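
The offline analysis mentioned for the big data processing layer follows the map, shuffle, and reduce steps. The sketch below runs them in a single Python process to count page views from access-log lines; it illustrates the programming model only and is not Hadoop's API. The log format and function names are assumptions.

```python
from collections import defaultdict


def map_phase(log_line):
    """Map: emit (path, 1) for each access-log line like 'GET /path 200'."""
    _method, path, _status = log_line.split()
    yield path, 1


def shuffle(pairs):
    """Shuffle: group all values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


logs = [
    "GET /product/1 200",
    "GET /member/home 200",
    "GET /product/1 200",
]

mapped = [pair for line in logs for pair in map_phase(line)]
page_views = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(page_views)
```

In a real cluster the map and reduce functions run on many machines in parallel, and the resulting counts would be written to a database for the application or service layer to read.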

