How Should Performance Testing Be Done?

I recently came across the performance test report for Alibaba's Dubbo middleware, and it left me with the impression that the people who ran it do not really understand performance testing. I am afraid the report will lead readers astray, so I want to write this article as a bit of popular science.

First, the main problems with this test report are as follows:

1) It reports only averages. Frankly, averages are very unreliable.

2) Response time is not linked to throughput (TPS/QPS). Testing response time only at low request rates is completely wrong.

3) Response time and throughput are not linked to the success rate.

Why the average is not reliable

As for why averages are unreliable: if you follow the news, you often see figures like the average wage, the average house price, or the average household spending, and you probably already know why such averages are misleading. (These are statistical games; science and engineering students should have a natural immunity to them.)

The same is true of software performance testing: the average is unreliable. For the details you can read the article "Why averages suck and percentiles are great"; here I will only summarize briefly.

We know that the results of a performance test are never all the same; they vary from high to low, and averaging them can mislead. Suppose we run a test 10 times: 9 runs take 1ms and 1 run takes 1s. The average is then about 100ms, which obviously does not reflect the real behavior at all. The 1s request may well be an outlier, noise that should be discarded. This is why, in judged competitions, the highest and lowest scores are often dropped before the rest are averaged.
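To see this concretely, here is a minimal Python sketch (mine, not part of the original report) that reproduces the example above:

```python
import statistics

samples_ms = [1] * 9 + [1000]          # nine 1ms requests plus one 1s outlier

print(statistics.mean(samples_ms))     # 100.9 -> the single outlier dominates
print(statistics.median(samples_ms))   # 1.0   -> the median is barely affected
```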

In addition, the median may be somewhat more reliable than the average. The idea of the median is to sort a set of data from smallest to largest and take the number in the middle; this guarantees that at least 50% of the data points are at or below the median and at least 50% are at or above it.

Of course, the most correct statistical approach is to use a percentile distribution, known in English as TP, the Top Percentile. TP50 means that 50% of the requests complete within a certain time; TP90 means that 90% of the requests complete within a certain time.

For example, take the data set [10ms, 1s, 200ms, 100ms] and sort it from smallest to largest: [10ms, 100ms, 200ms, 1s]. For TP50, 50% of the requests means ceil(4 × 0.5) = 2 requests, and the 2nd value is 100ms; for TP90, 90% of the requests means ceil(4 × 0.9) = 4 requests, and the 4th value is 1s. So TP50 is 100ms and TP90 is 1s.
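The same calculation written out as a small Python sketch (the helper name top_percentile is my own; real load-testing tools report these values directly):

```python
import math

def top_percentile(samples_ms, p):
    """TP(p): the smallest sample that at least p% of all samples do not exceed."""
    ordered = sorted(samples_ms)
    rank = math.ceil(len(ordered) * p / 100)   # ceil(n * p) as a 1-based rank
    return ordered[rank - 1]

data = [10, 1000, 200, 100]                    # the example above, in ms
print(top_percentile(data, 50))                # 100  -> TP50 is 100ms
print(top_percentile(data, 90))                # 1000 -> TP90 is 1s
```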

When I worked on the performance test of a real-time financial system at Reuters, the response-time requirement was that 99.9% of requests had to complete within 1ms, and the average response time also had to be within 1ms. Two constraints on one metric.

Why response time (latency) must be linked to throughput

Looking at throughput alone, without response time, tells you nothing about a system's performance. My system may be able to take 100,000 requests per second, but if the response time is 5 seconds throughout, the system is unusable and that throughput figure is meaningless.

We know that as concurrency (and with it throughput) rises, the system becomes less and less stable: response times fluctuate and grow, throughput stops climbing, and CPU usage keeps rising. Once the system becomes unstable, its throughput number is meaningless. Throughput is only meaningful while the system is stable.

Therefore, every throughput figure must be gated by a response-time requirement. For example: with TP99 under 100ms, the maximum load the system can sustain is 1000 QPS. This means we test repeatedly at different concurrency levels in order to find the highest throughput the software can deliver while it stays stable.
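A hedged sketch of that search, in Python (run_load_test here is a fake stand-in for a real load generator such as wrk or JMeter; only the gating logic reflects the text):

```python
from dataclasses import dataclass

TP99_LIMIT_MS = 100   # the response-time gate from the example above

@dataclass
class LoadResult:
    tp99_ms: float
    success_rate: float

def run_load_test(target_qps: int) -> LoadResult:
    # Placeholder: a real test drives the system with a load generator and
    # reports measured values. Here we fake a system that degrades past
    # 1000 QPS, purely for illustration.
    degraded = target_qps > 1000
    return LoadResult(tp99_ms=500.0 if degraded else 50.0,
                      success_rate=0.9 if degraded else 1.0)

def find_max_stable_qps(levels):
    """Return the highest tested QPS whose TP99 and success rate still pass."""
    best = 0
    for qps in sorted(levels):
        result = run_load_test(qps)
        if result.tp99_ms <= TP99_LIMIT_MS and result.success_rate == 1.0:
            best = qps     # still within the gate: remember it, go higher
        else:
            break          # gate breached: the previous level was the maximum
    return best

print(find_max_stable_qps([100, 200, 500, 1000, 2000]))   # -> 1000
```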

Why response time, throughput, and success rate must be linked

This should not be hard to understand: if the requests are not succeeding, what is the point of the performance test? For example, if I claim my system can handle 100,000 concurrent requests but the failure rate is 40%, then that 100,000 figure is a complete joke.

Performance testing should tolerate only a very low failure rate. For critical systems, the success rate must be 100%, with no room for ambiguity.

How to do performance testing rigorously

In general, a performance test must consider the following factors together: throughput, latency (response time), resource utilization (CPU/memory/IO/bandwidth...), success rate, and system stability.

The performance-testing method below comes essentially from my former employer, Thomson Reuters, a company that builds real-time financial data systems.

First, define the system's response-time (latency) requirement, TP99 is recommended, together with a success rate. For example, the Reuters definition: 99.9% of response times must be within 1ms, the average response time must be within 1ms, and 100% of requests must succeed.
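To illustrate what such a three-part acceptance check could look like, a minimal Python sketch (the function and data shapes are my own, not Reuters code):

```python
import math
import statistics

def meets_sla(samples):
    """samples: list of (latency_ms, succeeded) pairs from one test run."""
    latencies = sorted(latency for latency, _ in samples)
    tp999 = latencies[math.ceil(len(latencies) * 0.999) - 1]   # TP99.9
    return (tp999 < 1.0 and                          # 99.9% within 1ms
            statistics.mean(latencies) < 1.0 and     # average within 1ms
            all(ok for _, ok in samples))            # 100% success

print(meets_sla([(0.4, True)] * 999 + [(0.9, True)]))   # True: all gates pass
```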

Second, find the highest throughput under this response-time limit. The test data should include payloads of various sizes, small, medium, and large, and they should be mixed; ideally, use data sampled from production.

Third, run a soak test at this throughput. For example, keep the system under continuous pressure for 7 days straight at the throughput obtained in step two. Collect CPU, memory, disk/network IO, and other metrics to check whether the system is stable, for instance whether CPU and memory usage hold steady. If it is, this value is the system's performance.
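What the metric collection on the machine under test might look like, as a minimal sketch (it assumes the psutil package; a real setup would use a proper monitoring stack):

```python
import csv
import time

import psutil   # assumption: pip install psutil

SAMPLES = 7 * 24 * 60   # one sample per minute for 7 days

with open("soak_metrics.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["epoch", "cpu_pct", "mem_pct", "disk_read_b", "net_sent_b"])
    for _ in range(SAMPLES):
        disk = psutil.disk_io_counters()
        net = psutil.net_io_counters()
        writer.writerow([int(time.time()),
                         psutil.cpu_percent(interval=None),
                         psutil.virtual_memory().percent,
                         disk.read_bytes,
                         net.bytes_sent])
        f.flush()
        time.sleep(60)
```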

Fourth, find the system's limit. For example: the throughput the system can hold for 10 minutes with a 100% success rate (regardless of how long the responses take).

Fifth, do a burst test. Run for 5 minutes at the throughput from step two, then 1 minute at the limit from step four, then back to the step-two throughput for 5 minutes, then the step-four limit for 1 minute again, and keep alternating like this for a period of time, say 2 days. Collect the system metrics (CPU, memory, disk/network IO, etc.) and watch their curves together with the corresponding response times to make sure the system stays stable.
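The alternating schedule written out as a hedged sketch (apply_load and both QPS values are placeholders; only the 5-minute/1-minute alternation comes from the text):

```python
import time

NORMAL_QPS = 1000               # throughput from step two (example value)
LIMIT_QPS = 2500                # limit from step four (example value)
TOTAL_SECONDS = 2 * 24 * 3600   # keep alternating for 2 days

def apply_load(qps: int, seconds: int) -> None:
    # Placeholder: a real implementation drives the system at `qps`
    # with a load generator for `seconds`.
    print(f"driving {qps} QPS for {seconds}s")
    time.sleep(seconds)

elapsed = 0
while elapsed < TOTAL_SECONDS:
    apply_load(NORMAL_QPS, 5 * 60)   # 5 minutes at the sustainable rate
    apply_load(LIMIT_QPS, 1 * 60)    # 1 minute burst at the limit
    elapsed += 6 * 60
```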

Sixth, test low throughput and small network packets. Latency can rise precisely when throughput is low; for example, leaving the TCP_NODELAY option off can cause latency to climb (see my earlier article on TCP). And small network packets can leave the bandwidth underutilized, which also keeps performance down. So, depending on the actual situation, the performance test should cover these two scenarios as well.
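For reference, enabling TCP_NODELAY on a client socket looks like this (a minimal sketch; the host and port are placeholders). The option disables Nagle's algorithm, so small writes are sent immediately instead of being coalesced:

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # disable Nagle's algorithm
sock.connect(("example.com", 80))   # placeholder peer
sock.sendall(b"ping")               # a small write goes out at once, not batched
sock.close()
```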

(Note: at Reuters, the step-two throughput multiplied by 66.7% is used as the system's soft alarm line, and 80% of it as the hard alarm line; the limit value is reserved purely for absorbing burst peaks.)

Complicated, isn't it? Yes. That is because this is engineering: engineering is a science, and science is rigorous.

You are welcome to share your experience and methods of performance testing.

Source: http://coolshell.cn/articles/17381.html
