Reveal the secrets of increasing QPS

Last Update:2017-01-04 Source: Internet

Author: User

Guide: In actual development, many people fear an increase in the system's QPS, because they think the system will go down if the QPS gets too high. With this mindset, they try to reduce the number of requests the system receives as much as possible. Some even pack a large amount of processing into a single service call, so that one external request triggers all of the work at once.

This approach reduces the system's request volume, but does it reduce the system's QPS? Does it make the system safer or more dangerous?

First, we will introduce the basic concepts.

1. Key Performance Indicators

• System throughput (Throughput)

Throughput is the number of requests the system processes per unit of time. It reflects the overall processing capability of the system.

• Response time (Latency)

The average time the system takes to respond to a request.
Generally, the performance of a system is constrained by both throughput and response time. For example, a system may be able to withstand 1 million concurrent requests, but if its latency is more than 2 minutes, that 1 million load is meaningless. Likewise, a very short latency is of little use if the throughput is very low.
In general:
• The higher the throughput, the worse the latency. When the request volume is too large, the system is too busy and naturally responds more slowly.
• The better the latency, the higher the throughput the system can support. Because latency is short, each request is processed quickly, so more requests can be processed.
• Concurrency
The number of requests/transactions the system is processing at the same time.
• QPS (TPS, queries per second/transactions per second)
QPS = Concurrency / Average response time
QPS summarizes both the throughput and latency indicators, which makes it one of the most important indicators of a system. But what happens to the system when its QPS increases, and how can we avoid the harm that an increase in QPS can cause?
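The relationship QPS = Concurrency / Average response time can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `qps` and the numbers (100 requests in flight, 125 ms and 500 ms response times) are made up, not measurements from any real system.

```python
def qps(concurrency: int, avg_response_time_s: float) -> float:
    """QPS = concurrency / average response time (in seconds)."""
    if avg_response_time_s <= 0:
        raise ValueError("response time must be positive")
    return concurrency / avg_response_time_s


# Hypothetical system: 100 requests in flight, 125 ms average response time.
healthy = qps(100, 0.125)

# Same concurrency, but latency has degraded to 500 ms.
degraded = qps(100, 0.5)

print(healthy, degraded)
```

With the concurrency held constant, a fourfold increase in response time cuts the QPS the system can sustain to a quarter, which is why latency and throughput have to be read together.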
Next, let's take a look at the main model of the service-oriented system and the consumption of system resources.
2. Composition modes of a service-oriented system

2.1 Basic Services
A basic service generally includes two types of operations: business logic processing and database read/write.
What system resources will be consumed when a request is sent?
System resources occupied by a request
When a request arrives, it consumes ordinary system resources such as CPU (for computing), memory, and network connections. If the system is based on Java, the request also occupies JVM resources, namely heap and stack, of which the heap is the more important indicator. If the request needs to interact with the database, it also consumes connections from the system's database connection pool. On the DB side, it consumes the DB's computing resources, where the most important indicators are the DB response time and the number of DB connections.
2.2 Integration service
Compared with a basic service, an integration service only depends on other services and holds no data of its own.
System resources occupied by a request
In this kind of system, we can think of each service it depends on as a DB, except that a request no longer consumes the system's database connection pool resources.
2.3 Hybrid service
This is the most commonly used structure. It holds its own business data and also depends on other services for part of its computation.
Resource consumption of a hybrid service
This structure combines the resource consumption of the two structures above.
2.4 System resource consumption
System load

• System CPU utilization

If the system's CPU usage is very high, the system is compute-intensive. If the QPS cannot go any higher at this point, we need to expand capacity quickly, adding machines to share the computation and increase system throughput.

• System memory

If CPU usage is normal but the system's QPS cannot go any higher, the machine is not busy computing but is limited by some other resource, such as memory or IO. In that case, first check whether memory is insufficient. If it is, expand capacity quickly.
For Java projects, the JVM heap is also a direct reflection of memory pressure, for example the share of memory occupied by the old generation and whether Full GC occurs.

• System IO

IO usage generally moves opposite to CPU usage: when CPU usage is high, IO usage is usually low, and when IO usage is high, CPU usage is usually not high.

• Network bandwidth (number of supported network connections)

When the system's network bandwidth is saturated, the entrance and exit of the system are effectively blocked. External requests cannot get in, so the QPS naturally cannot go up.
In our systems, a connection pool is usually used to connect to the database, an HTTP connection pool to call dependent systems, and a thread pool to serve requests from other services. Often it is the pool's own limit on the maximum number of connections that gets exhausted, while the other resources of the machine are still normal. In this case, you can raise the connection limit to increase system throughput. Do so with caution, however: larger pools consume system resources faster, and the pressure is passed on to the dependent systems.
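The pool exhaustion described above can be reproduced with a toy pool. This is a minimal sketch, not a real database driver: the class `BoundedPool`, its methods, and the `conn-N` strings standing in for connections are all hypothetical names introduced for illustration.

```python
import queue


class BoundedPool:
    """Toy pool that hands out at most max_size 'connections' at a time."""

    def __init__(self, max_size: int):
        self._free = queue.Queue(maxsize=max_size)
        for i in range(max_size):
            self._free.put(f"conn-{i}")  # stand-ins for real connections

    def acquire(self, timeout_s: float) -> str:
        """Wait up to timeout_s for a free connection, then give up."""
        try:
            return self._free.get(timeout=timeout_s)
        except queue.Empty:
            raise TimeoutError("connection pool exhausted") from None

    def release(self, conn: str) -> None:
        self._free.put(conn)


pool = BoundedPool(max_size=2)
a = pool.acquire(0.1)
b = pool.acquire(0.1)

try:
    pool.acquire(0.1)  # third caller: the pool is empty although the CPU is idle
    exhausted = False
except TimeoutError:
    exhausted = True

pool.release(a)
c = pool.acquire(0.1)  # succeeds again once a connection is returned
print(exhausted, c)
```

The third caller fails purely because of the pool limit, not because the machine is busy. Raising `max_size` would let it through, at the cost of opening more connections against the dependent system.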
Dependent system performance

• DB Performance

DB performance is often the root of system problems, because once the database runs into a serious problem, it can cause not just one system's failure but business logic problems in every system that depends on that database.
In practice, the most common problem in development is poorly written SQL that causes low DB read/write performance, for example statements that do not use an index or that lock too wide a range of a table. If the database's reads and writes have already reached the machine's limits, consider moving to a better machine, replacing the hard disks, or adding read replicas. Optimization in this area is a complex topic and will be covered in a separate article.

• Performance of dependent services

When your service depends on other teams' services, you often cannot be sure of their performance. If possible, the downstream systems can be urgently scaled out to solve their own performance problems. For your own system, you can protect yourself with fail-fast and interface degradation.
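Fail-fast with a degraded response can be sketched as a call with a short timeout and a fallback value. The example below is a minimal illustration using the standard library's `concurrent.futures`. The names `slow_dependency` and `call_with_fallback`, the timeout values, and the fallback strings are assumptions made up for this sketch.

```python
import concurrent.futures
import time

# Shared worker pool used to run calls to dependent services.
_executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def slow_dependency() -> str:
    """Stand-in for a downstream service that is responding slowly."""
    time.sleep(0.3)
    return "fresh recommendations"


def call_with_fallback(fn, timeout_s: float, fallback: str) -> str:
    """Return fn()'s result, or the fallback if it does not arrive in time."""
    future = _executor.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        future.cancel()  # best effort: a call that is already running is not interrupted
        return fallback


degraded = call_with_fallback(slow_dependency, 0.05, "default recommendations")
normal = call_with_fallback(lambda: "fresh", 1.0, "default")
print(degraded, normal)
```

The caller waits at most `timeout_s` and then answers with the degraded result, so a slow dependency no longer holds the caller's request for its full duration. Note that the slow call still occupies a worker thread until it finishes, so the worker pool itself has to be sized with that in mind.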

If the system indicators and dependent system indicators above all look normal but the system's QPS still cannot go up, the problem is inside the system itself, for example the system is blocked.
Before optimizing, we need to profile and analyze the system. By the 80/20 principle, 20% of the code consumes 80% of the running time; find that 20% and you can address 80% of the performance cost.
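As one way to find that 20%, a profiler reports where the time actually goes. The sketch below uses Python's built-in `cProfile` and `pstats` modules. The functions `hot_path`, `cold_path`, and `handle_request` are hypothetical stand-ins for real request-handling code.

```python
import cProfile
import io
import pstats


def hot_path(n: int) -> int:
    """Deliberately expensive: this is the code we expect to dominate."""
    return sum(i * i for i in range(n))


def cold_path() -> int:
    """Cheap code that is not worth optimizing."""
    return 42


def handle_request() -> int:
    return hot_path(200_000) + cold_path()


profiler = cProfile.Profile()
profiler.enable()
handle_request()
profiler.disable()

# Print the five entries with the highest cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

In the report, `hot_path` and the generator it drives account for nearly all of the cumulative time, which tells us where optimization effort should go before any code is changed.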
3. Common system optimization tips

3.1 Code optimization

• Call interfaces asynchronously

When calling dependent services, group multiple time-consuming requests and send them asynchronously in parallel, which cuts unnecessary waiting time.
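A minimal sketch of sending several slow calls in parallel, using Python's `asyncio`. The service names and the 0.2 second delays are hypothetical, and `asyncio.sleep` stands in for a real network round trip.

```python
import asyncio
import time


async def call_service(name: str, delay_s: float) -> str:
    await asyncio.sleep(delay_s)  # stands in for a network round trip
    return f"{name}: ok"


async def handle_request() -> list[str]:
    # All three calls are in flight at once; results keep the call order.
    return await asyncio.gather(
        call_service("user", 0.2),
        call_service("inventory", 0.2),
        call_service("pricing", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(handle_request())
elapsed = time.perf_counter() - start
print(results, round(elapsed, 2))
```

Called one after another, the three requests would take about 0.6 seconds in total. Sent in parallel, the request waits roughly as long as the slowest single call.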

• Asynchronous IO and buffering

The most common file IO in a system is logging. When recording logs, set an appropriate log buffer and write the log file asynchronously. Log only where necessary and avoid log abuse, which not only puts pressure on IO but also wastes disk space. In some extreme cases, system throughput drops sharply because the disk space is exhausted.

For other operations that need file reads and writes, we recommend asynchronous methods as well, to reduce the chance of blocking.
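Asynchronous log writing can be sketched with the standard library's `QueueHandler` and `QueueListener`: the request thread only puts the record on an in-memory queue, and a background thread does the file IO. The logger name `app`, the temporary log path, and the log messages are assumptions for this illustration.

```python
import logging
import logging.handlers
import os
import queue
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "app.log")

# The listener thread drains the queue and performs the actual file writes.
log_queue = queue.Queue(-1)  # unbounded in-memory buffer
file_handler = logging.FileHandler(log_path)
listener = logging.handlers.QueueListener(log_queue, file_handler)
listener.start()

logger = logging.getLogger("app")
logger.setLevel(logging.INFO)  # log only what is necessary
logger.propagate = False
logger.addHandler(logging.handlers.QueueHandler(log_queue))

logger.info("order %s accepted", 42)   # enqueued, written by the listener
logger.debug("verbose detail")         # below INFO, never reaches the queue

listener.stop()       # processes everything still queued, then stops
file_handler.close()

with open(log_path) as f:
    written = f.read()
print(written)
```

The calling thread never touches the disk, so a slow disk delays the listener rather than the request. An unbounded queue trades that safety for memory, so a production setup would likely bound it and decide what to drop when it fills.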

• Do not use excessively large objects in API requests and responses

Overly large requests and responses increase the pressure on network bandwidth, and very large payloads can more easily suffer data loss in transmission.

• Use cache as appropriate

This is the most common optimization method in Internet services and will not be covered in detail here.
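For readers who want a concrete picture, here is a minimal in-process caching sketch using `functools.lru_cache`. The function `load_product_name` and the counter `db_calls` are hypothetical; the function body stands in for a real database read.

```python
from functools import lru_cache

db_calls = 0  # counts how often the "database" is actually hit


@lru_cache(maxsize=1024)
def load_product_name(product_id: int) -> str:
    """Stand-in for a database read; results are cached per product_id."""
    global db_calls
    db_calls += 1
    return f"product-{product_id}"


for _ in range(1000):
    load_product_name(7)

print(db_calls, load_product_name.cache_info().hits)
```

A thousand requests for the same product reach the backing store once. `lru_cache` has no expiry, so data that changes needs an explicit invalidation strategy or a cache with a time-to-live.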

• Use threads with caution

Some people say that threads are evil, because the bottlenecks of multithreading lie in mutex and synchronization locks and in the cost of thread context switches. Using as few locks as possible, or none at all, is fundamental. In addition, when using a thread pool, avoid an improper pool type or size limit, or the pool itself becomes a system bottleneck.
3.2 Database optimization

• Database locks

Under concurrency, locks have a very large impact on performance: the various isolation levels, row locks, table locks, page locks, read/write locks, transaction locks, and the various write-first or read-first mechanisms. The best performance comes from not locking at all. Therefore, database/table sharding, redundant data, and reducing the use of consistent transactions can effectively improve performance.

• Use indexes

When reading and writing data, always check that the conditions in the WHERE clause actually use an index.
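Most databases can report whether a query uses an index. In MySQL this is done with the `EXPLAIN` statement. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` instead, only because SQLite ships with Python and makes the example self-contained. The table `orders`, the index `idx_orders_user_id`, and the helper `plan` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (user_id, amount) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)


def plan(sql: str) -> str:
    """Return the human-readable query plan SQLite chooses for sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text


query = "SELECT id, amount FROM orders WHERE user_id = 42"

before = plan(query)  # no index on user_id yet: full table scan
conn.execute("CREATE INDEX idx_orders_user_id ON orders (user_id)")
after = plan(query)   # the WHERE condition can now use the index

print(before)
print(after)
```

Before the index exists the plan is a scan of the whole table. Afterwards it becomes a search using `idx_orders_user_id`. The exact wording of the plan text varies between SQLite versions.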

• Avoid SQL-level JOIN operations

JOIN operations in SQL complicate index optimization. Because Internet projects change often, the indexes on data tables are continually being tuned. With JOIN clauses in use, it is quite likely that a query will fail to hit the right index, and index behavior at the SQL level becomes very difficult to maintain.
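One common alternative, sketched here as an assumption rather than as the author's prescribed method, is to run two simple single-table queries and combine the rows in application code. The example uses SQLite so that it is self-contained. The tables `users` and `orders` and their contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount INTEGER);
INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob');
INSERT INTO orders (id, user_id, amount) VALUES (10, 1, 30), (11, 2, 15), (12, 1, 5);
""")

# Query 1: a single-table read on orders.
orders = conn.execute(
    "SELECT id, user_id, amount FROM orders WHERE amount >= 10 ORDER BY id"
).fetchall()

# Query 2: a single-table read on users, by primary key.
user_ids = sorted({user_id for _, user_id, _ in orders})
placeholders = ", ".join("?" for _ in user_ids)
names = dict(
    conn.execute(
        f"SELECT id, name FROM users WHERE id IN ({placeholders})", user_ids
    ).fetchall()
)

# The "join" happens in application code.
joined = [(order_id, names[user_id], amount) for order_id, user_id, amount in orders]
print(joined)
```

Each query touches one table and one obvious index, so the index behavior stays easy to reason about as the schema changes. The price is an extra round trip and more application code.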

• Partial result sets

Add an appropriate LIMIT to queries.

A. Do not use SELECT *; specify each field explicitly. If multiple tables are involved, prefix each field name with its table name rather than leaving the engine to work it out.

B. Avoid HAVING where possible, because it has to traverse all records, which performs poorly.

C. Use UNION ALL instead of UNION where possible.

D. Too many indexes make INSERT and DELETE slower. UPDATE is slow if it touches most of the indexes, but if it touches only one index, only that one index table is affected.

E. There is a great deal of material on MySQL optimization. We recommend the book High Performance MySQL (2nd edition), which discusses high-performance MySQL in more depth.
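Several of the tips above can be seen in a small, self-contained sketch. It uses SQLite rather than MySQL only so that it runs without a server. The tables `orders_2016` and `orders_2017` and their rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders_2016 (id INTEGER PRIMARY KEY, user_id INTEGER);
CREATE TABLE orders_2017 (id INTEGER PRIMARY KEY, user_id INTEGER);
INSERT INTO orders_2016 (id, user_id) VALUES (1, 7), (2, 8);
INSERT INTO orders_2017 (id, user_id) VALUES (3, 7), (4, 9);
""")

# UNION removes duplicate rows, which costs an extra de-duplication step.
union_rows = conn.execute(
    "SELECT user_id FROM orders_2016 UNION SELECT user_id FROM orders_2017"
).fetchall()

# UNION ALL simply concatenates the two result sets.
union_all_rows = conn.execute(
    "SELECT user_id FROM orders_2016 UNION ALL SELECT user_id FROM orders_2017"
).fetchall()

# Explicit, table-qualified columns plus LIMIT instead of SELECT *.
first_page = conn.execute(
    "SELECT orders_2016.id, orders_2016.user_id "
    "FROM orders_2016 ORDER BY orders_2016.id LIMIT 1"
).fetchall()

print(len(union_rows), len(union_all_rows), first_page)
```

UNION ALL is only a drop-in replacement when duplicates are acceptable or cannot occur. Here user 7 appears in both tables, so the two queries return different row counts.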

From: http:// OS .51cto.com/art/201609/517341.htm

Address: http://www.linuxprobe.com/service-qps-increases.html

This article is an English version of an article originally written in Chinese on aliyun.com.
