"Large-Scale Website Technical Architecture" Notes (III): High-Performance and High-Availability Architectures

Source: Internet
Author: User
Tags: HTTP request, memory usage, message queue, website performance
Performance Test Metrics

1. Response time.
2. Concurrency. If accurate monitoring is not yet in place, the number of concurrent users can be estimated differently for different business models. A rough way to estimate a system's peak concurrency is: average number of requests per second over an entire day × 3. But this still has to be judged case by case.
3. Throughput. Common measures are QPS (queries per second), HPS (HTTP requests per second), and TPS (transactions per second).
4. Performance counters. These include system load, number of threads, CPU and memory usage, and so on; they can be viewed with commands such as top, free, and cat /proc/cpuinfo. System load is defined as the total number of processes currently being executed by the CPU plus those waiting to be executed. When this value equals the number of logical CPUs, all of them are fully utilized, which by one view is the best state; others consider a load of about 0.7 times the number of logical CPUs ideal.
1) System load, tasks, CPU, memory usage:

$ top
top - 22:50:09 up 1093 days,  6:14,  1 user,  load average: 18.18, 14.55, 10.18
Tasks: 275 total,  12 running, 261 sleeping,   0 stopped,   2 zombie
Cpu(s):  1.9% us,  8.5% sy, 42.1% ni, 47.3% id,  0.0% wa,  0.0% hi,  0.1% si
Mem:  65878264k total, 65837688k used,    40576k free,   130476k buffers
Swap:  1020088k total,   635080k used,   385008k free, 40273792k cached

2) Memory usage:

$ free
             total       used       free     shared    buffers     cached
Mem:      65878264   65757048     121216          0     135408   39984028
-/+ buffers/cache:   25637612   40240652
Swap:      1020088     635080     385008

3) Number of logical CPU processors:

$ cat /proc/cpuinfo | grep "processor" | wc -l
12
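The same count can also be obtained programmatically, together with the 0.7 × logical CPUs load heuristic mentioned earlier; a minimal Python sketch:

```python
import os

# Number of logical CPUs visible to the OS (counts hyper-threads too).
logical_cpus = os.cpu_count()

# Heuristic from the text: a load average around 0.7 x logical CPUs
# is considered a comfortable upper bound.
comfortable_load = 0.7 * logical_cpus

print(f"logical CPUs: {logical_cpus}")
print(f"comfortable load average: {comfortable_load:.1f}")
```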

Note that the "processor" count is the one that should prevail; the "cpu cores" field may deviate from it (for example, under hyper-threading). For accurate information on a server's CPU, search the web for the model_name reported in /proc/cpuinfo.

Web Front-End Performance Optimization

Browser Access Optimization

1. Reduce HTTP requests. Avoid building too many connections: combine JS, CSS, and image files where possible, and design the back end's interfaces sensibly so the front end does not need too many round trips.
2. Use the browser's cache. Set Cache-Control and Expires in the HTTP headers. Give JS and other static filenames a timestamp and update the timestamp only when the file changes, so unchanged files stay cached; avoid updating a large number of static resources at the same time.
3. Compress static resources.
4. Place CSS at the top of the page and JS at the bottom, so CSS renders early and JS does not block the page. But judge case by case: if, for example, DOM nodes the page depends on are generated by JS, the file's position may have to differ.
5. Reduce cookie transmission. Give static resources a separate domain name so requests for them carry no cookies, cutting transmission cost. Cookies can be read via document.cookie.

CDN Acceleration

CDNs cache images, files, CSS, and scripts. But CDN acceleration works better for PC clients than for mobile: one survey found that the higher the last-mile latency, the weaker the CDN's relative effect (see the article "Why CDN acceleration has 'no' effect on mobile clients").

Reverse Proxy

A reverse proxy can provide layer-7 load balancing (balancing policies over HTTP requests), along with caching of static resources, request forwarding, protection against network attacks, and so on. Nginx is a popular choice.

Application Server Performance Optimization

Distributed Cache

The first law of website performance optimization: prioritize caching. In general, data put in the cache should have a read/write ratio of at least 2:1 and should be hotspot data. Using a cache means accepting the possibility of short-term data inconsistency, or else paying the performance and resource overhead of updating the cache in real time. Also consider that once the cache fails, the flood of requests hitting the DB directly can cause a performance avalanche; deploying the cache as a cluster avoids losing too much cached data at once under pressure spikes. For hotspot data, consider cache warm-up: load the hot data into the cache before peak hours to improve peak performance. To defend against malicious attacks that keep querying nonexistent data, causing cache misses and frequent DB access, cache the nonexistent keys too and clean them up periodically, and keep a mechanism for identifying and blocking malicious requests. A distributed cache should be decentralized: instances are homogeneous and do not communicate with each other, which ensures scalability and keeps system complexity down.
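The cache-aside lookup with negative caching described above can be sketched as follows; an in-process dict with TTLs stands in for a real distributed cache, and the names and TTL values are illustrative, not from the text:

```python
import time

CACHE = {}           # key -> (value, expires_at); stand-in for a cache cluster
TTL = 300            # normal entries live 5 minutes (illustrative)
NEGATIVE_TTL = 30    # nonexistent keys are cached briefly, then cleaned up

MISSING = object()   # sentinel marking "known not to exist"

def db_lookup(key, db):
    return db.get(key, MISSING)

def get(key, db):
    entry = CACHE.get(key)
    if entry is not None:
        value, expires_at = entry
        if time.time() < expires_at:
            # Cache hit -- including hits on cached "does not exist"
            # results, which shields the DB from repeated bogus lookups.
            return None if value is MISSING else value
        del CACHE[key]  # expired; fall through to the DB

    value = db_lookup(key, db)
    ttl = NEGATIVE_TTL if value is MISSING else TTL
    CACHE[key] = (value, time.time() + ttl)
    return None if value is MISSING else value

db = {"user:1": "Alice"}
print(get("user:1", db))    # fetched from the DB, then cached
print(get("user:1", db))    # now served from the cache
print(get("user:999", db))  # None, and the miss itself is cached
```

With this shape, a flood of requests for a key that does not exist hits the DB at most once per NEGATIVE_TTL window.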

Asynchronous

Anything that can be done later should be done later.
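A minimal illustration of that principle with Python's standard queue module; an in-process worker thread stands in for the distributed message queue discussed below:

```python
import queue
import threading

tasks = queue.Queue()   # stand-in for a distributed message queue
results = []

def worker():
    # The consumer drains the queue at its own pace, smoothing spikes.
    while True:
        item = tasks.get()
        if item is None:        # sentinel: shut down
            break
        results.append(f"processed {item}")
        tasks.task_done()

t = threading.Thread(target=worker)
t.start()

# Producer side: the request handler just enqueues and returns at once.
for order_id in range(3):
    tasks.put(order_id)

tasks.put(None)
t.join()
print(results)   # the three orders, processed after the fact
```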

Distributed message queues achieve the goal of peak shaving. Solve problems with technology that matches the business, as 12306 does with its queueing scheme.

Cluster

Adopting clusters is also a form of service virtualization: it avoids single points of failure while providing higher-availability, higher-performance service.

Code Optimization

In multithreaded code, if the work is compute-intensive, the number of threads should not exceed the number of CPU cores; if it is IO-heavy, threads = [task execution time / (task execution time − IO wait time)] × number of CPU cores. In addition, design objects to be stateless, prefer local objects, and refine lock granularity appropriately. Reuse resources, for example with the singleton pattern and connection pools. Set JVM parameters reasonably to avoid unreasonable full GCs.
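The IO-bound thread-count formula above, as a small calculation; the timings are made-up numbers for illustration:

```python
import os

def io_bound_threads(task_time_ms, io_wait_ms, cpu_cores):
    """Threads = [task time / (task time - IO wait)] * CPU cores."""
    compute_time = task_time_ms - io_wait_ms
    return round(task_time_ms / compute_time * cpu_cores)

cores = os.cpu_count() or 1
# Example: each task takes 100 ms, of which 80 ms is waiting on IO,
# leaving 20 ms of CPU work -- so each core can drive about 5 threads.
print(io_bound_threads(100, 80, cores))
```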

Storage Performance Optimization

Relational database indexes are implemented with B+ trees. Many NoSQL databases store data with LSM trees instead: an LSM tree keeps the most recently written data in memory until memory runs out, then merges it into the next level of the on-disk LSM tree. For write-heavy workloads whose reads mostly target recently written data, this performs much better than a B+ tree. For mass data storage and analysis, HDFS combined with MapReduce handles concurrent access and redundant backup automatically, with high reliability, effectively providing RAID-like capability.

Highly Available Applications

Make services stateless and use load balancing so that a stateless service can fail over. Manage sessions centrally for the cluster. Use CAS for single sign-on across the cluster.

Highly Available Services

Grading management: distinguish high-priority from low-priority services to allow for possible downgrades. Timeout settings: on timeout, apply a matching retry or fast-fail strategy. Asynchronous invocation. Service degradation: during peak periods, randomly refusing or shutting down some services, where appropriate, lets resources go preferentially to the important ones. Idempotent design: an operation is idempotent when repeating it does not change the state of the data; apart from some core writes and transactions, design everything else to be idempotent, and customize load balancing and failure strategies accordingly.

Highly Available Data

Achieved through data backup and failover. Backups are divided into cold and hot standbys. Failover must switch data access over quickly when a data node fails. Make good use of caching, which has become an integral part of the data layer in large systems. Use distributed storage systems, such as NFS, for data storage.
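The idempotent-design point can be sketched as a write that records which operation IDs it has already applied, so a retried call (after, say, a timeout) does not change state twice; the names here are illustrative:

```python
balances = {"alice": 100}
applied_ops = set()   # in practice a durable store, e.g. a DB unique key

def deposit(op_id, account, amount):
    # A retried request carries the same op_id and is ignored,
    # which makes the operation idempotent.
    if op_id in applied_ops:
        return balances[account]
    applied_ops.add(op_id)
    balances[account] += amount
    return balances[account]

deposit("op-1", "alice", 50)
deposit("op-1", "alice", 50)   # retry after a timeout: no double credit
print(balances["alice"])       # 150, not 200
```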
Because clustered, distributed services are used, CAP (consistency, availability, partition tolerance) forces a trade-off somewhere in the data storage design. Sometimes, for the sake of A and P, strong consistency of the data is sacrificed; the right balance has to be decided together with the business.

Monitoring for High Availability

Monitor user behavior and server performance, and on top of that monitoring build alarms, failover, automatic degradation, and so on, to keep the site highly available in operation.

Summary

Throughout these two chapters, the book's author hopes to convey one idea: technology serves the business. Achieving high-performance, high-availability service in the current business context is the simplest yet most essential thing we must attend to, and all of it comes back to the business.
