Analysis of Performance Test Results

Last Update:2016-05-09 Source: Internet

Author: User


Most performance test engineers can use test tools to run load and stress tests, but many cannot analyze the results those tools collect. Below I have compiled my own work experience together with material I have gathered, in the hope that it helps with analyzing test results.

Analysis principles:

1. Analyze each problem on its own terms, because application systems, test purposes, and performance concerns all differ.

2. Look for bottlenecks in the following order, from easy to difficult:

Server hardware bottleneck -> network bottleneck (can be ignored on a LAN) -> server operating system bottleneck (parameter configuration) -> middleware bottleneck (parameter configuration, database, Web server, etc.) -> application bottleneck (SQL statements, database design, business logic, algorithms, etc.)
Note: Not every analysis needs to go through all of these steps. How deep to go depends on the purpose and requirements of the test. For low-demand projects, it is enough to determine where the system's hardware bottleneck will be under the expected future load (number of concurrent users, data volume).

3. Eliminating causes segment by segment is very effective.
Information sources for analysis:
1. Error messages reported while the scenario is running
2. Monitoring metrics collected during the test

One: Error Message Analysis
Analysis examples:
1. Error: Failed to connect to server "10.10.10.30:8080": [10060] Connection timed out
Error: Server "10.10.10.30" has shut down the connection prematurely
Analysis:
A. The application service has died.
(With a small number of users: a problem in the program itself, or in how the program handles the database.)
B. The application service has not died.
(A problem with the application service's parameter settings.)
Example: If many client connections to the WebLogic application server are refused while the server side reports no errors, the AcceptBacklog attribute of the Server element in WebLogic may be set too low. If you receive "connection refused" messages when connecting, increase the value by 25% each time.
C. A database connection problem.
(1. A performance parameter of the application service may be set too low. 2. The database's maximum number of connections has been reached; this limit is related to hardware memory.)
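
The "increase by 25% each time" advice for AcceptBacklog can be tabulated ahead of a tuning session. The following is a minimal sketch; the starting value of 50 and the function name are assumptions made for illustration, not values taken from WebLogic or from this article.

```python
import math


def backlog_steps(start, rounds, growth=0.25):
    """Return successive backlog values, each 25% larger than the last (rounded up)."""
    values = []
    current = start
    for _ in range(rounds):
        current = math.ceil(current * (1 + growth))
        values.append(current)
    return values


if __name__ == "__main__":
    # 50 is a hypothetical starting value; use the value from your own configuration.
    print(backlog_steps(50, 4))
```

Each value would be tried in turn, rerunning the scenario until the "connection refused" messages stop.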

2. Error: Page download timeout (seconds) has expired
Analysis: Possible causes include:
A. An application service parameter is set too high, which turns the server into a bottleneck.
B. The page contains too many images.
C. The program checks too many fields when processing a table.

Two: Monitoring Metrics Data Analysis
1. Maximum number of concurrent users:
This is the largest number of concurrent users the application system can sustain in the current environment (hardware, network, and software environment, including parameter configuration).
If, during a scenario run, the business operations of more than 3 users fail or the server shuts down, the system cannot withstand that level of concurrent load in the current environment. The maximum number of concurrent users is then the concurrent-user count of the preceding run, in which these problems did not occur.
If the measured maximum meets the performance requirement, the server resources are in good condition, and business response times meet user requirements, the result is acceptable. Otherwise, analyze the cause further using each server's resource usage and the response times of the business operations.
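
The rule for the maximum number of concurrent users can be sketched in a few lines. This is only an illustration: the tuple layout and the function name are assumptions, and the runs are assumed to be listed in order of increasing load.

```python
def max_concurrent_users(runs, max_failed_users=3):
    """Return the user count of the last run before the first run that broke.

    runs: list of (concurrent_users, failed_users, server_shutdown) tuples,
          ordered by increasing load (hypothetical layout).
    A run "breaks" when more than max_failed_users users fail or the server shuts down.
    Returns None if even the first run breaks.
    """
    previous = None
    for users, failed, shutdown in runs:
        if failed > max_failed_users or shutdown:
            return previous
        previous = users
    return previous


if __name__ == "__main__":
    sample_runs = [(50, 0, False), (100, 1, False), (150, 5, False), (200, 20, True)]
    print(max_concurrent_users(sample_runs))
```

If no run breaks, the function returns the highest load tested, which only means the real limit was not reached.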

2. Business operation response time:
Start analyzing a scenario run with the Average Transaction Response Time graph and the Transaction Performance Summary graph. The Transaction Performance Summary graph shows which transactions took too long to respond during the scenario.
Break those transactions down and analyze the performance of each page component. Which page components cause the long transaction response time? Is the problem related to the network or to the server?
If the server is taking too long, use the corresponding server graphs to find the problematic server metrics and pinpoint the cause of the degradation. If the network is taking too long, use the Network Monitor graph to identify the network problem behind the bottleneck.

3. Server resource monitoring metrics:

Memory:
1. UNIX resource monitoring: for the memory paging rate metric (Paging rate), an occasionally high value means that threads are competing for memory at that moment. A persistently high value means memory may be a bottleneck; it may also mean the memory access hit ratio is low.
2. Windows resource monitoring: if the Process/Private Bytes and Process/Working Set counters keep rising over a long period while the Memory/Available Bytes counter keeps falling, there is a good chance of a memory leak.
Symptoms that memory is the system performance bottleneck:
Very high page-out rate (high pageout rate);
Processes entering an inactive state;
High activity on all disks in the swap area;
Possibly high global system CPU utilization;
Out-of-memory errors (out of memory errors)
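
The Windows memory-leak rule above (Private Bytes and Working Set keep rising while Available Bytes keeps falling) can be checked against sampled counter values. This is a minimal sketch under the assumption that the counters have already been exported as plain lists of numbers; the function names are hypothetical and "keeps rising" is interpreted strictly as "never decreases".

```python
def steadily_rising(samples):
    """True if the series never decreases and ends higher than it started."""
    return (
        len(samples) >= 2
        and all(later >= earlier for earlier, later in zip(samples, samples[1:]))
        and samples[-1] > samples[0]
    )


def suspect_memory_leak(private_bytes, working_set, available_bytes):
    """Both process counters keep rising while available memory keeps falling."""
    # A falling series is a rising series once every value is negated.
    available_falling = steadily_rising([-value for value in available_bytes])
    return (
        steadily_rising(private_bytes)
        and steadily_rising(working_set)
        and available_falling
    )
```

Real counter data is noisy, so in practice a smoothed series or a trend line would probably be a better input than raw samples.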

Processor:
1. UNIX resource monitoring (the same applies to Windows): if the CPU utilization metric (CPU utilization) stays above 95%, the CPU is the bottleneck. Consider adding a processor or replacing it with a faster one. If the server is dedicated to SQL Server, the maximum acceptable limit is 80-85%, and a reasonable usage range is 60% to 70%.
2. Windows resource monitoring: if System/Processor Queue Length is greater than 2 while processor utilization (Processor Time) stays low, the processor is blocked.
Symptoms that CPU is the system performance bottleneck:
Slow response time (slow response time)
Zero CPU idle time (zero percent idle CPU)
High user CPU time (high percent user CPU)
High system CPU time (high percent system CPU)
Long run queue sustained over time (large run queue size sustained over time)
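
The two processor rules above can be expressed as a small check over sampled values. This is a sketch under stated assumptions: the article does not say how low "low" utilization is, so the 50% cut-off is an assumed placeholder, and the function name and result labels are hypothetical.

```python
def diagnose_cpu(cpu_percent, queue_length,
                 busy_threshold=95.0, queue_threshold=2, low_utilization=50.0):
    """Return a list of findings for sampled CPU utilization and run-queue length."""
    findings = []
    if not cpu_percent:
        return findings
    # Rule 1: utilization stays above 95% for every sample.
    if min(cpu_percent) > busy_threshold:
        findings.append("cpu_bottleneck")
    # Rule 2: queue longer than 2 while utilization stays low.
    # low_utilization (50%) is an assumption, not a figure from the article.
    if queue_length:
        average_queue = sum(queue_length) / len(queue_length)
        if average_queue > queue_threshold and max(cpu_percent) < low_utilization:
            findings.append("processor_blocking")
    return findings
```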

Disk I/O:
1. UNIX resource monitoring (Windows is similar): if the disk exchange rate metric (Disk rate) stays high, there is an I/O problem. Consider switching to a faster disk system.
2. Windows resource monitoring: if Disk Time and Avg. Disk Queue Length are high while Page Reads/sec is low, there may be a disk bottleneck.
Symptoms that I/O is the system performance bottleneck:
High disk utilization (high disk utilization)
Long disk wait queue (large disk queue length)
High percentage of time spent waiting for disk I/O (large percentage of time waiting for disk I/O)
High physical I/O rate (large physical I/O rate; not sufficient in itself)
Low buffer cache hit ratio (low buffer cache hit ratio; not sufficient in itself)
Long run queue while the CPU is idle (large run queue with idle CPU)

4. Database server:
SQL Server database:
1. SQL Server resource monitoring: for the cache hit ratio metric (Cache Hit Ratio), the higher the value the better. If it stays below 80%, consider adding memory.
2. If the Full Scans/sec counter shows a value higher than 1 or 2, analyze your queries to determine whether full table scans are really needed and whether the SQL can be optimized.
3. Number of Deadlocks/sec: deadlocks are very harmful to application scalability and lead to a poor user experience. This counter must be 0.
4. Lock Requests/sec: this counter can be lowered by optimizing queries to reduce the number of reads.
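
The first three SQL Server thresholds above can be applied mechanically once the counter values have been collected. The following is a minimal sketch; the dictionary keys are hypothetical names chosen for this example (they are not the exact performance counter paths), and the full-scan limit uses 2, the upper end of the "1 or 2" guidance.

```python
def check_sql_server_counters(counters, full_scan_limit=2):
    """Return warnings for a dict of averaged counter values (hypothetical keys)."""
    warnings = []
    if counters["cache_hit_ratio"] < 80.0:
        warnings.append("cache hit ratio below 80%: consider adding memory")
    if counters["full_scans_per_sec"] > full_scan_limit:
        warnings.append("full scans/sec is high: review queries for table scans")
    if counters["deadlocks_per_sec"] > 0:
        warnings.append("deadlocks detected: this counter must be 0")
    return warnings


if __name__ == "__main__":
    sample = {"cache_hit_ratio": 75.0, "full_scans_per_sec": 5, "deadlocks_per_sec": 0}
    for line in check_sql_server_counters(sample):
        print(line)
```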

Oracle database:
1. If free memory is close to 0 and the library cache hit ratio or the data dictionary cache hit ratio is below 0.90, increase shared_pool_size.
Library cache (shared SQL area) and data dictionary cache hit ratios:
select (sum(pins - reloads)) / sum(pins) from v$librarycache;
select (sum(gets - getmisses)) / sum(gets) from v$rowcache;
Free memory: select * from v$sgastat where name = 'free memory';

2. If the data buffer cache hit ratio is below 0.90, increase the db_block_buffers parameter (unit: blocks).
Buffer cache hit ratio:
select name, value from v$sysstat where name in ('db block gets', 'consistent gets', 'physical reads');
Hit Ratio = 1 - (physical reads / (db block gets + consistent gets))
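
The buffer cache hit ratio formula can be evaluated from the three v$sysstat values returned by the query. This is a small sketch; the function name is hypothetical and the numbers in the usage example are made up for illustration.

```python
def buffer_cache_hit_ratio(db_block_gets, consistent_gets, physical_reads):
    """Hit Ratio = 1 - (physical reads / (db block gets + consistent gets))."""
    logical_reads = db_block_gets + consistent_gets
    if logical_reads == 0:
        return None  # no logical reads yet, so the ratio is undefined
    return 1 - physical_reads / logical_reads


if __name__ == "__main__":
    # Made-up sample values, not measurements.
    ratio = buffer_cache_hit_ratio(2000, 8000, 500)
    print(f"hit ratio: {ratio:.2f}, below 0.90: {ratio < 0.90}")
```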

3. If the number of redo log buffer space requests is large, increase the log_buffer parameter.
Redo log buffer requests:
select name, value from v$sysstat where name = 'redo log space requests';

4. If the in-memory sort hit ratio is below 0.95, increase sort_area_size to avoid sorting on disk. (The query below returns a percentage, so the threshold corresponds to 95.)
In-memory sort hit ratio:
select round((100 * b.value) / decode((a.value + b.value), 0, 1, (a.value + b.value)), 2) from v$sysstat a, v$sysstat b where a.name = 'sorts (disk)' and b.name = 'sorts (memory)';
Note: The SQL Server and Oracle checks above are only a few simple, basic ones. Analyzing and tuning an Oracle database in particular is a specialty of its own; consult dedicated material for deeper analysis.

Analyzing the results is the most important part of performance testing. In real work, analyzing test results requires a great deal of related expertise, so it is common to collect the data and then not know where to start. That is exactly what I found awkward and tricky while learning performance testing. I have been reading a book on Web performance testing, and what follows is the performance analysis part of its Chapter 4, Web Application Performance Analysis. I am posting it here in the hope that we can discuss it together:

One: The basics of performance analysis:

1. Several important performance indicators: response time, throughput, throughput rate, TPS (transactions processed per second), and hit rate.

2. System bottlenecks fall into two categories: network and server. Server bottlenecks mainly involve four areas: the application, the Web server, the database server, and the operating system.

3. A general, rough method of performance analysis:

As the pressure on the system (the number of concurrent users) increases, if the throughput rate and TPS curves change in broadly the same way, the system is basically stable. If the throughput curve slows down or even flattens after the pressure reaches a certain level, network bandwidth is probably the bottleneck. Likewise, if the hit rate/TPS curve slows down or flattens, the server is starting to become a bottleneck.
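
Spotting where a throughput or TPS curve "slows down or flattens" can be approximated numerically. The sketch below reports the first load level at which the metric grows by less than a chosen fraction over the previous level; the 5% cut-off and the function name are assumptions for illustration, not figures from the book.

```python
def find_plateau(points, min_gain=0.05):
    """Return the user count where the curve first flattens, or None.

    points: list of (concurrent_users, metric_value) sorted by user count.
    "Flattens" means the relative gain over the previous point is below min_gain.
    """
    for (_, previous_value), (users, value) in zip(points, points[1:]):
        if previous_value > 0 and (value - previous_value) / previous_value < min_gain:
            return users
    return None


if __name__ == "__main__":
    # Made-up TPS measurements at increasing load.
    tps_curve = [(50, 100), (100, 195), (150, 280), (200, 285), (250, 284)]
    print(find_plateau(tps_curve))
```

Running the same check on the throughput curve and on the TPS curve, and comparing which one flattens first, follows the network-versus-server reasoning described above.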

4. The author proposes the following basic principle of performance analysis, which I strongly agree with:

Work from the outside in, from the surface to the core, going deeper layer by layer.

Applying this principle, the analysis can be divided into three steps:

Step one: Compare the measured response time with the users' performance expectations to determine whether there is a bottleneck.

Step two: Compare TN (network response time) with TS (server response time) to determine whether the bottleneck is in the network or on the server.

Step three: Analyze further, examining the response time of finer-grained components until the root cause of the bottleneck is identified.
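
The first two steps amount to a simple decision, which can be sketched as follows. The function name and return labels are hypothetical, and all times are assumed to be in the same unit (for example seconds).

```python
def locate_bottleneck(measured, expected, tn, ts):
    """Apply step one and step two.

    measured: measured response time
    expected: response time the users expect
    tn: network response time (TN)
    ts: server response time (TS)
    """
    # Step one: is there a bottleneck at all?
    if measured <= expected:
        return "no bottleneck"
    # Step two: which side takes the larger share of the time?
    return "network" if tn > ts else "server"


if __name__ == "__main__":
    # Made-up values: 6 s measured against a 3 s expectation.
    print(locate_bottleneck(measured=6.0, expected=3.0, tn=1.0, ts=5.0))
```

Step three has no generic code equivalent, since it depends on which finer-grained components the server or network side can be broken into.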

Two: Taking a Web application as an example, here are the specific analysis methods:

1. User transaction analysis:

A. Transaction Summary graph (Transaction Summary): a histogram of the user transactions that succeeded and failed. Analyzing the success and failure counts shows directly whether the system is running normally. A very large number of failed transactions indicates a bottleneck in the system or a problem during program execution.

B. Average Transaction Response Time graph (Average Transaction Response Time): shows the average time each transaction took during each second of the test scenario run, along with each transaction's maximum, minimum, and average values over the scenario run. It is used to analyze the system's performance trend. If every transaction's response time stays on a roughly flat curve, system performance is basically stable. If the average transaction response time gradually gets slower, performance is trending downward, and a memory leak is one possible cause.

C. Transactions per Second graph (Transactions per Second, TPS): shows the number of passed, failed, and stopped transactions for each transaction in each second of the scenario run. It can be used to determine the actual transaction load on the system at any given moment. If the number of transactions passing per unit of time keeps falling as the test progresses, the server has hit a bottleneck.

D. Total Transactions per Second graph (Total Transactions per Second): shows the total number of transactions that passed, failed, and stopped in each second of the scenario run. If the curve is close to a straight line under constant pressure, performance is basically stable. If the total number of transactions passing per unit of time keeps falling, overall performance is degrading; the cause may be a memory leak or a defect in the program.

E. Transaction Performance Summary graph (Transaction Performance Summary): shows the minimum, maximum, and average execution time of every transaction in the scenario, so you can judge directly whether response times meet the customer's requirements (focus on the average and maximum execution times).

F. Transaction Response Time Under Load graph (Transaction Response Time Under Load): shows the relationship between transaction response time and the number of running users at any point, which gives you performance data on how the system behaves under concurrent users.

G. Transaction Response Time (Percentile) graph (Transaction Response Time (Percentile)): this graph is a consolidated analysis of the test results and should be read as a whole. The maximum response time of some transactions may be very long, but if most transactions have an acceptable response time, the system's performance meets the requirement.

H. Transaction Response Time (Distribution) graph (Transaction Response Time (Distribution)): shows how many transactions completed within each response-time range during the test. If the minimum and maximum acceptable response times for the relevant transactions were defined in advance, you can use this graph to determine whether system performance is within the acceptable range.
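
The percentile and distribution views in items G and H can be reproduced from raw response times when the graphs themselves are not at hand. The following is a minimal sketch using the nearest-rank percentile definition, which is an assumption; the analysis tool may compute percentiles differently. The function names and sample data are made up for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples <= it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def distribution(samples, bucket_width):
    """Count samples per response-time bucket, keyed by the bucket's lower bound."""
    counts = {}
    for sample in samples:
        start = int(sample // bucket_width) * bucket_width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Made-up response times in seconds: one slow outlier among fast transactions.
    times = [0.8, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.9, 9.5]
    print(percentile(times, 90), max(times))
    print(distribution(times, 1))
```

In the sample data the maximum is far above the 90th percentile, which is the situation item G describes: a long worst case while most transactions remain acceptable.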

At this stage of the analysis you can only determine where the bottleneck may lie; pinpointing it requires deeper analysis. There are no images attached, so this may be a little hard to follow, but if you already know these graphs well it should be fairly straightforward.


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
