"Large Web site technology architecture--core principles and case studies" reading notes (4)

Tags: send cookies, browser cache, website performance

Chapter 4: High-Performance Website Architecture


Website performance is both objective and subjective. Objectively, it can be measured concretely through technical indicators such as response time and throughput. Subjectively, it is a matter of perception that varies with the participant: a user's impression differs from an engineer's, and different users have different experiences.


First, website performance testing

Performance testing is the prerequisite and basis for performance optimization, and also the yardstick for checking and measuring its results. Website performance looks different from different viewpoints, and each viewpoint calls for different optimization methods.

1. Site performance from different perspectives

(1) From the user's perspective, website performance is how fast or slow the site feels to respond in the browser. Main optimization means: optimize the page's HTML, exploit the browser's concurrent and asynchronous features, tune the browser cache policy, use CDN services, reverse proxies, and so on.

(2) From the developer's perspective, the main concerns are the performance of the application itself and its subsystems, including technical indicators such as response latency, system throughput, concurrent processing capacity, and system stability. Main optimization means: use caching to speed up data reads, use clustering to improve throughput, use asynchronous messaging to speed up request responses and shave load peaks, and improve program performance through code optimization.

(3) From the operations perspective, the focus is on infrastructure performance and resource utilization, such as carrier network bandwidth, server hardware configuration, data-center network architecture, and the utilization of server and bandwidth resources. Main optimization means: build and optimize the backbone network, use cost-effective custom servers, and use virtualization to improve resource utilization.

2. Performance test indicators

(1) Response time

The time an application takes to perform an operation, measured from when the request is issued until the last of the response data is received.

(2) Concurrency

The number of requests the system can process simultaneously, which also reflects the system's load characteristics.

(3) Throughput

The number of requests the system processes per unit of time, reflecting its overall processing capacity. Common quantitative measures of throughput include TPS (transactions per second), HPS (HTTP requests per second), and QPS (queries per second). As the number of concurrent users grows (with server resource consumption growing alongside it), throughput rises gradually until it reaches a limit; past that limit, throughput falls as concurrency keeps increasing, and when the system crashes with its resources exhausted, throughput drops to zero. Over the same process, response time first rises slowly, climbs rapidly once the throughput limit is reached, and at the crash point the system stops responding altogether.

(4) Performance counter

Data metrics that describe the performance of a server or operating system, including system load, the number of objects and threads, memory usage, CPU usage, disk and network I/O, and other indicators.

System load is the sum of the number of processes currently executing on the CPU and those waiting for the CPU; it is an important indicator of how busy the system is. On a multi-core machine, the ideal situation is that every CPU is in use and no process is waiting, so the ideal load value equals the number of CPUs. When load is below the CPU count, CPUs sit idle and resources are wasted; when load is above it, processes are queuing for CPU scheduling, a sign that system resources are insufficient and application performance will suffer. On Linux, use the top command to view it.
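The load check described above can be sketched with Python's standard library (a rough illustration only; `os.getloadavg` is Unix-specific, reporting the same 1-, 5-, and 15-minute averages as `top` and `uptime`):

```python
import os

def load_status():
    """Compare the 1-minute load average against the CPU count."""
    one, five, fifteen = os.getloadavg()
    cpus = os.cpu_count()
    return {
        "load_1min": one,
        "cpus": cpus,
        # load below the CPU count suggests idle capacity;
        # load above it means processes are queuing for the CPU
        "saturated": one > cpus,
    }

status = load_status()
print(status)
```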

3. Performance test methods

Performance test: taking the performance indicators planned at design time as the expected target, continuously apply pressure to the system to verify that it can meet those expectations while resource consumption stays within an acceptable range.

Load test: keep increasing concurrent requests to raise the pressure on the system until one or more performance indicators reach a safety threshold, for example when some resource becomes saturated. Beyond that point, applying more pressure no longer improves the system's processing capacity but instead degrades it.

Stress test: beyond the safe load, keep applying pressure until the system crashes or can no longer handle any requests, so as to find the maximum pressure the system can tolerate.

Stability test: under given hardware, software, and network conditions, run the system for a long time under a certain business load to check its stability. In production, request pressure is uneven across the day and fluctuates in waves, so to better simulate the production environment, a stability test should apply uneven pressure to the system.
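As a toy illustration of the concurrency ramp these tests perform, the sketch below fires requests at a dummy handler (a stand-in for a real HTTP call, not a real load-testing tool) and measures throughput at each concurrency level:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request():
    # Stand-in for a real request (e.g. an HTTP call); sleeps 1 ms.
    time.sleep(0.001)
    return True

def measure_throughput(concurrency, total_requests=200):
    """Fire total_requests requests at the given concurrency level
    and return requests handled per second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda _: handle_request(), range(total_requests)))
    elapsed = time.perf_counter() - start
    return len(results) / elapsed

# Ramp up concurrency, as a load test does, and watch throughput change.
for c in (1, 5, 20):
    print(f"concurrency={c:3d}  throughput={measure_throughput(c):8.1f} req/s")
```

A real load test would of course target the deployed system over the network and also record response-time percentiles at each step.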

4. Performance test report

The test report should reflect the regularities of the performance-test curves, so a reader can extract the important information: whether the system's performance meets the design goals and business requirements, the system's maximum load capacity, and the maximum pressure it can tolerate.

5. Performance optimization strategy

If the performance test results do not meet the design or business requirements, you need to find the system's bottlenecks, divide and conquer, and optimize step by step.

(1) Performance analysis

A large website's architecture is complex: from the browser issuing the request to the database completing the transaction, a request passes through many links. If testing or user reports show that the site responds slowly and has performance problems, you must analyze every link the request passes through to rule out potential bottlenecks and locate the problem.

Locating a website's performance bottleneck is basically the same as locating a program's: check the logs of each link in request processing, analyze which link's response time is unreasonable or exceeds expectations, then examine the monitoring data to determine whether the main factor affecting performance is memory, disk, network, or CPU, and whether it is a code problem, an unreasonable architecture design, or genuinely insufficient system resources.

(2) Performance optimization

Once the specific cause of a performance problem is located, optimize accordingly. By website layer, optimization falls into three categories: Web front-end performance optimization, application server performance optimization, and storage server performance optimization.


Second, Web front-end performance optimization

1. Browser access optimization

(1) Reduce HTTP requests

Each HTTP request must establish a communication link for data transmission, and on the server side each HTTP request needs a separate thread to handle it. These communication and service overheads are expensive, so reducing the number of HTTP requests can effectively improve access performance.

The main means of reducing HTTP requests are merging CSS, merging JavaScript, and merging images. Merge the JavaScript and CSS the browser needs into single files so the browser needs only one request for each. Images can also be merged, combining many images into one; if each image has a different hyperlink, use CSS offsets to respond to mouse clicks and construct different URLs (the CSS-sprite technique).
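A minimal sketch of the merging idea, assuming a hypothetical build step that concatenates several source files into one bundle the browser fetches with a single request:

```python
def merge_assets(sources):
    """Concatenate several JS (or CSS) files into one response body,
    so the browser needs one HTTP request instead of one per file.
    `sources` maps file name -> file content."""
    parts = []
    for name, body in sources.items():
        # Keep a comment marker per source file, purely as a debugging aid.
        parts.append(f"/* --- {name} --- */\n{body.rstrip()}\n")
    return "\n".join(parts)

bundle = merge_assets({
    "jquery.js": "function $(sel) { /* ... */ }",
    "app.js": "console.log('ready');",
})
print(bundle)
```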

(2) Use the browser cache

By setting the Cache-Control and Expires attributes in the HTTP header, you can have the browser cache static resources for days or even months.

Sometimes a changed static resource must reach client browsers promptly. This can be achieved by changing the file name: to update a JavaScript file, do not update its content in place; instead generate a new JS file and update the reference in the HTML.

When a website that relies on browser caching updates its static resources, it should not update them in one batch. If 10 icon files need updating, do not push all 10 at once; update one file at a time, with an interval in between. Otherwise users' browsers suddenly invalidate a large number of cached files and refetch them all at once, causing server load to spike and the network to congest.
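The file-renaming scheme above is often implemented by embedding a content hash in the file name; a minimal sketch (the `versioned_name` helper and the file names are illustrative):

```python
import hashlib

def versioned_name(filename, content):
    """Embed a short content hash in a static file's name, so a changed
    file gets a new URL and old browser caches simply miss it."""
    digest = hashlib.md5(content).hexdigest()[:8]
    stem, dot, ext = filename.rpartition(".")
    return f"{stem}.{digest}.{ext}"

old = versioned_name("app.js", b"alert('v1');")
new = versioned_name("app.js", b"alert('v2');")
print(old, new)  # different names: browsers are forced to fetch the new file
```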

(3) Enable compression

Compressing files on the server side and decompressing them on the browser side effectively reduces the amount of data transmitted. Text files can compress by as much as 80%, so enabling gzip for HTML, CSS, and JavaScript gives good results. Compression does, however, put some pressure on both server and browser, a trade-off to weigh when communication bandwidth is plentiful but server resources are scarce.
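A quick demonstration of why gzip pays off for text, compressing a repetitive HTML fragment with Python's standard gzip module:

```python
import gzip

# Repetitive text (HTML/CSS/JS) compresses very well; images such as
# JPEG are already compressed and would gain little.
html = ("<div class='item'><a href='/page'>link</a></div>\n" * 200).encode()
compressed = gzip.compress(html)
ratio = 1 - len(compressed) / len(html)
print(f"original={len(html)}B compressed={len(compressed)}B saved={ratio:.0%}")
```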

(4) Put CSS at the top of the page, JavaScript at the bottom

The browser renders the page only after downloading all the CSS, so place CSS at the top of the page and let the browser download it as early as possible. JavaScript, by contrast, executes immediately after it loads, which can block the page and make it render slowly, so JavaScript is best placed at the bottom. If the page parsing itself needs the JavaScript, however, putting it at the bottom is not appropriate.

(5) Reduce cookie transmission

On the one hand, cookies are carried in every request and response, and an oversized cookie seriously affects data transfer, so consider carefully what data really needs to be written to cookies and minimize the amount of data they carry. On the other hand, for access to static resources such as CSS and JavaScript, sending cookies is meaningless; consider serving static resources from a separate domain so that requests for them carry no cookies, reducing cookie traffic.

2. CDN Acceleration

A CDN (content delivery network) is essentially still a cache: data is cached at the place nearest the user, so the user gets it at the fastest speed, on the so-called first hop of network access.

A CDN generally caches static resources: images, files, CSS, JavaScript scripts, static web pages, and so on. These files are accessed very frequently, so caching them on a CDN can greatly speed up page loads.

3. Reverse Proxy

A reverse proxy server sits on the website's side of the network and receives HTTP requests on behalf of the web servers.

The reverse proxy server also helps secure the site: access requests from the Internet must pass through the proxy server, which puts a barrier between the web servers and potential network attacks.

The proxy server can also speed up web requests by enabling its caching feature. When a user first accesses static content, it is cached on the reverse proxy server; when other users later request the same content, it is returned directly from the reverse proxy, speeding up response times and reducing load on the web servers.

A reverse proxy can also perform load balancing, and an application cluster built on load balancing raises the system's overall processing capacity and so improves the website's performance under high concurrency.


Third, application server performance optimization

1. Distributed cache

The first law of website performance optimization: prefer using caching to optimize performance. A distributed cache is a cache deployed across a cluster of many servers, providing caching as a clustered service. Use caching wisely, and note the following points:

(1) Avoid caching frequently modified data: if you cache it, the data may be invalidated before the application has a chance to read it back, so the system keeps writing to the cache for nothing and carries an extra burden.

(2) Avoid caching data without access hotspots: memory is limited, so only recently accessed data can stay cached while historical data is evicted. If access has no hotspots, that is, if most accesses are not concentrated on a small portion of the data, caching is pointless, because most data is evicted before it is ever accessed again.

(3) Data inconsistency and dirty reads: set an expiration time on cached data; once it expires, the data is reloaded from the database. The application must therefore tolerate the data being inconsistent for up to that period.

(4) Cache availability: caching exists to improve read performance, so losing cached data or losing the cache itself should not break the application, which can fetch the data directly from the database. In practice, however, when the cache servers crash, the database may go down because it cannot possibly withstand the sudden pressure, taking the whole site with it. This is known as a cache avalanche. The cache must never be treated as a reliable data source. Using a distributed cache server cluster, with cached data spread across many servers, improves cache availability to some extent.

(5) Cache warm-up: a cache holds hot data, which the cache system normally distills over time through LRU (least recently used) eviction as data keeps being accessed, a process that takes quite a while. A freshly started cache system holds no data, and while the cache is being rebuilt, system performance and database load are both poor. It is better to load the hot data when the cache system starts; this pre-loading is called cache warm-up.

(6) Cache penetration: for data that exists in neither the cache nor the database, every request falls through to the database, putting great pressure on it and possibly crashing it. A simple countermeasure is to cache the nonexistent data as well (with a null value).
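Several of the points above (expiration to bound dirty reads, null caching against penetration) can be sketched in a toy cache-aside class; the `CacheAside` name and the dict standing in for the database are illustrative, not from the book:

```python
import time

class CacheAside:
    """Minimal cache-aside sketch: entries expire after `ttl` seconds,
    bounding staleness, and database misses are cached as None so that
    repeated lookups cannot "penetrate" through to the database."""
    def __init__(self, db, ttl=60):
        self.db = db          # stand-in for the real database
        self.ttl = ttl
        self.store = {}       # key -> (value, expires_at)
        self.db_hits = 0

    def get(self, key):
        entry = self.store.get(key)
        if entry and entry[1] > time.time():
            return entry[0]                # cache hit (possibly a cached None)
        self.db_hits += 1
        value = self.db.get(key)           # may be None: cache it anyway
        self.store[key] = (value, time.time() + self.ttl)
        return value

cache = CacheAside(db={"user:1": "alice"})
print(cache.get("user:1"), cache.get("user:1"))    # second read served from cache
print(cache.get("missing"), cache.get("missing"))  # null cached: one DB hit only
print(cache.db_hits)
```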

2. Asynchronous operations

Message queues have a good peak-shaving effect: through asynchronous processing, transaction messages generated by a short burst of high concurrency are stored in the message queue, flattening the concurrency peak. A rule of thumb: anything that can be done later should be done later.
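A minimal in-process sketch of the peak-shaving idea, with a queue between a bursty producer and a slower consumer (a real system would use a message-queue server such as RabbitMQ or Kafka rather than an in-process queue):

```python
import queue
import threading
import time

# A burst of "orders" arrives faster than the worker can process them;
# the queue absorbs the spike and the consumer drains it at its own pace.
tasks = queue.Queue()
processed = []

def worker():
    while True:
        item = tasks.get()
        if item is None:          # sentinel: shut down
            break
        time.sleep(0.001)         # simulate slow downstream work (e.g. a DB write)
        processed.append(item)
        tasks.task_done()

t = threading.Thread(target=worker)
t.start()

# Producer: enqueue 50 requests "instantly"; each put returns at once,
# so the caller is not blocked by the slow downstream work.
for i in range(50):
    tasks.put(i)

tasks.put(None)                   # no more work
t.join()
print(f"processed {len(processed)} messages")
```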

3. Using clusters

Under high concurrency, use load-balancing technology to build a cluster of multiple servers for an application and distribute concurrent requests across them, so that no single server slows down under excessive load and user requests enjoy better response-latency characteristics.
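The simplest policy a load balancer can use is round-robin; a toy sketch (the server addresses are made up):

```python
import itertools

class RoundRobinBalancer:
    """Sketch of the simplest load-balancing policy: hand each incoming
    request to the next server in the cluster, in rotation."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

lb = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
assignments = [lb.pick() for _ in range(6)]
print(assignments)  # each server gets every third request
```

Real balancers add health checks and weighted or least-connection policies on top of this.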

4. Code optimization

(1) Multithreading

Concurrent access by many users is a basic requirement of a website. The main means of ensuring thread safety are: design objects to be stateless; use local objects; use locks when accessing shared resources concurrently.
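A small illustration of the lock-based approach (the other two techniques avoid the problem by having no shared mutable state at all):

```python
import threading

class SafeCounter:
    """Shared mutable state guarded by a lock, one of the thread-safety
    techniques listed above."""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:          # without the lock, += on shared state can race
            self._value += 1

counter = SafeCounter()
threads = [threading.Thread(target=lambda: [counter.increment() for _ in range(1000)])
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter._value)   # 8000 on every run; racy code could lose updates
```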

(2) Resource reuse

While the system is running, minimize the creation and destruction of expensive system resources such as database connections, network connections, threads, and complex objects. From a programming perspective, the two main patterns for resource reuse are the singleton and the object pool.
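A generic object-pool sketch (the factory and pool size are illustrative; real pools such as database connection pools add health checks, timeouts, and eviction):

```python
import queue

class ObjectPool:
    """Expensive objects (DB connections, threads, buffers) are created
    once up front and borrowed/returned, not constructed per request."""
    def __init__(self, factory, size):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(factory())

    def acquire(self):
        return self._pool.get()     # blocks if the pool is exhausted

    def release(self, obj):
        self._pool.put(obj)

created = []
def make_conn():
    # Stand-in for an expensive constructor (e.g. opening a DB connection).
    conn = f"conn-{len(created)}"
    created.append(conn)
    return conn

pool = ObjectPool(make_conn, size=2)
for _ in range(100):                # 100 "requests" reuse the same 2 objects
    c = pool.acquire()
    pool.release(c)
print(len(created))   # 2: no per-request creation
```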

(3) Data structure

Using the appropriate data structure for each scenario, and combining data structures flexibly to improve read, write, and computation characteristics, can greatly optimize a program's performance.

(4) Garbage collection

Size the young and old generations reasonably according to the business characteristics and object lifetimes, and minimize full GCs. In fact, some web applications can run their entire lifetime without triggering a full GC.


Fourth, storage performance optimization

Much of the time, disk is still the system's most serious bottleneck. The data stored on disk is also the website's most important asset, so disk availability and fault tolerance are critical as well.

1. Mechanical hard disk vs. solid-state drive

Mechanical hard disk: fast sequential access, slow random access

Solid-state drive: fast random access

2. B+ tree vs. LSM tree

The read/write characteristics of disks strongly influence the choice of storage structure and algorithm.

B+ tree: an N-ary sort tree optimized for disk storage. Data is stored on disk as tree nodes; a lookup starts from the root, loads the node at the recorded disk location into memory, and continues downward until it finds the required data. Since each disk access is random, and traditional mechanical disks perform poorly on random access, each lookup requires several disk accesses, which hurts data access performance.

LSM tree: a read always starts from the in-memory sort tree and, if the data is not found there, continues through the sorted trees on disk in order. An update on an LSM tree needs no disk access and completes in memory, much faster than on a B+ tree. When data access is mostly writes, with reads concentrated on recently written data, an LSM tree greatly reduces the number of disk accesses and speeds up access.
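A toy in-memory model of the LSM idea, with a memtable and immutable sorted runs standing in for on-disk segments (greatly simplified: no write-ahead log, no compaction):

```python
import bisect

class TinyLSM:
    """Toy LSM sketch: writes go to an in-memory memtable; when it fills,
    it is flushed as an immutable sorted run (standing in for an on-disk
    segment). Reads check the memtable first, then runs newest-first."""
    def __init__(self, memtable_limit=4):
        self.memtable = {}
        self.runs = []                     # newest run last
        self.limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value         # pure in-memory write: no disk seek
        if len(self.memtable) >= self.limit:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):    # newest segment wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

db = TinyLSM()
for i in range(10):
    db.put(f"k{i}", i)
print(db.get("k3"), db.get("k9"), db.get("missing"))
```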

3. RAID vs. HDFS

RAID (Redundant Array of Inexpensive Disks) is designed mainly to improve disk access speed and to enhance disk availability and fault tolerance. RAID is widely used with traditional relational databases and file systems, but in the NoSQL stores and distributed file systems that large websites prefer, RAID has fallen out of favor.

HDFS (Hadoop Distributed File System): paired with a parallel computing framework such as MapReduce for big-data processing, HDFS can read and write concurrently across all the disks in the cluster without needing RAID support.


Fifth, mind map

From: http://www.cnblogs.com/leo_wl/p/3812115.html

(Mind map image: http://s5.51cto.com/wyfs02/M00/7B/0F/wKiom1bFnEWDuWQ6AAXcY78QcnI554.jpg)

This article is from the "self-reliance, tenet" blog; please keep this source: http://wangzhichao.blog.51cto.com/2643325/1743107

"Large Web site technology architecture--core principles and case studies" reading notes (4)

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.