Performance test engineers can generally master the tools used for load and stress testing, but many do not know where to start when analyzing the results those tools collect. I hope these notes help you analyze test results. Analysis principles:
1. Analyze each problem on its own terms (application systems, test purposes, and performance concerns all differ).
2. Find the bottleneck in the following order, from easy to difficult.
Server hardware bottleneck -> network bottleneck (can be ignored on a LAN) -> server operating system bottleneck (parameter configuration) -> middleware bottleneck (parameter configuration, database, web server) -> application bottleneck (SQL statements, database design, business logic, algorithms, etc.)
Note: not every analysis has to go through all of these stages. The depth of the analysis should be determined by the test purpose and requirements. For projects with modest requirements, it is enough to determine where the hardware bottleneck will appear under the heavy load (number of concurrent users and data volume) the application system is expected to face in the future.
3. Eliminating candidates segment by segment is effective. Sources of information for the analysis:
1. Error messages raised while the scenario runs
2. Metric data collected from the test results
I. Error message analysis
Example 1
Error: Failed to connect to server "10.10.10.30:8080": [10060] Connection
Error: timed out
Error: Server "10.10.10.30" has shut down the connection prematurely
Analysis:
A. The application service is dead. (If this happens with only a few users, it is a program problem.)
B. The application service is not dead (application service parameter settings). For example, if many clients are refused connections to the WebLogic application server but no errors appear on the server, the AcceptBacklog attribute of the Server element in WebLogic may be set too low. If "connection refused" messages are received while connecting, increase the value by 25% at a time.
C. Database connections (1. a performance parameter in the application service may be too small; 2. the maximum number of connections allowed by the database, which depends on the hardware memory).
Example 2
Error: Page download timeout (120 seconds) has expired
Analysis: possible causes include:
A. An application service parameter is set too large, causing a server bottleneck.
B. Too many images on the page.
C. The program checks too many or overly large fields when processing a table.
II. Monitoring metric data analysis
1. Maximum number of concurrent users: the maximum number of concurrent users that the application system can withstand in the current environment (hardware, network, and software environment, including parameter configuration). If business operations fail for more than three users while the scenario runs, or a server goes down, the system cannot withstand the current concurrent load in the current environment, and the maximum number of concurrent users is the previous user count at which this did not happen. If the maximum number of concurrent users meets the performance requirement, the resources on each server are in good condition, and the business operation response times also meet user requirements, the result is acceptable. Otherwise, analyze the cause further based on the resource usage of each server and the business operation response times.
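The rule for the maximum number of concurrent users can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LoadStep` record and the `max_concurrent_users` function are hypothetical names, not part of any test tool, and the input is assumed to be one record per scenario run at a given user count.

```python
from typing import NamedTuple, Optional, Sequence


class LoadStep(NamedTuple):
    """One scenario run (hypothetical record, not a tool export format)."""
    users: int          # concurrent users in this run
    failed_users: int   # users whose business operation failed
    server_down: bool   # whether any server went down during the run


def max_concurrent_users(steps: Sequence[LoadStep],
                         max_failed: int = 3) -> Optional[int]:
    """Return the highest user count reached before the first failing run.

    A run fails when more than `max_failed` users fail or a server goes down.
    Returns None if even the smallest run fails or no runs are given.
    """
    best = None
    for step in sorted(steps, key=lambda s: s.users):
        if step.failed_users > max_failed or step.server_down:
            break
        best = step.users
    return best


runs = [
    LoadStep(50, 0, False),
    LoadStep(100, 2, False),
    LoadStep(150, 5, False),   # more than three failed users
    LoadStep(200, 0, False),
]
print(max_concurrent_users(runs))
```

The function stops at the first failing run on purpose: a later run that happens to pass does not change where the system first broke down.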
2. Business operation response time: start the analysis of the scenario run with the Average Transaction Response Time graph and the Transaction Performance Summary graph. The Transaction Performance Summary graph shows which transactions had long response times during the scenario run. Break those transactions down and analyze the performance of each page component. Which page components make the transaction response time too long? Is the problem in the network or on the server? If the server time is too long, use the corresponding server graphs to identify the problematic server measurements and find the cause of the server's performance degradation. If the network time is too long, use the network monitor graphs to identify the network problem causing the bottleneck.
3. Server resource monitoring metrics:
Memory:
1. In UNIX resource monitoring, watch the paging rate. If this value is high only occasionally, threads were competing for memory at that moment. If it stays high, memory may be the bottleneck. It may also mean that the memory access hit rate is low.
2. In Windows resource monitoring, if the Process\Private Bytes and Process\Working Set counters keep rising over a long period while the Memory\Available Bytes counter keeps falling, there may be a memory leak.
Symptoms of memory becoming the system performance bottleneck: high pageout rate; processes becoming inactive; high disk activity in the swap area; high overall system CPU utilization; out of memory errors.
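The Windows memory leak pattern described above (Private Bytes and Working Set rising while Available Bytes falls) can be checked mechanically once the counter samples are exported. The sketch below is a rough heuristic, not a definitive detector: the function names and the `min_ratio` parameter are assumptions introduced here, and the samples are assumed to be plain lists of numbers taken at regular intervals.

```python
from typing import Sequence


def is_steady_trend(samples: Sequence[float], rising: bool,
                    min_ratio: float = 0.9) -> bool:
    """True if at least `min_ratio` of consecutive steps move in one direction."""
    if len(samples) < 2:
        return False
    steps = list(zip(samples, samples[1:]))
    moved = sum(1 for a, b in steps if (b > a if rising else b < a))
    return moved / len(steps) >= min_ratio


def possible_memory_leak(private_bytes: Sequence[float],
                         working_set: Sequence[float],
                         available_bytes: Sequence[float]) -> bool:
    """Apply the rule of thumb: both process counters rise, available memory falls."""
    return (is_steady_trend(private_bytes, rising=True)
            and is_steady_trend(working_set, rising=True)
            and is_steady_trend(available_bytes, rising=False))


print(possible_memory_leak([100, 110, 120, 130],
                           [80, 85, 90, 99],
                           [500, 450, 400, 380]))
```

A positive result only says the counters match the pattern; a long enough run is still needed to tell a leak from a cache that is warming up.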
CPU:
1. CPU utilization in UNIX resource monitoring (the same applies to Windows). If this value stays above 95%, the CPU is the bottleneck. Consider adding a processor or switching to a faster one. If the server is dedicated to SQL Server, the maximum acceptable value is 80-85%. The reasonable operating range is 60%-70%.
2. In Windows resource monitoring, if System\Processor Queue Length is greater than 2 while processor utilization (% Processor Time) stays low, the processor is congested.
Symptoms of CPU becoming the system performance bottleneck: slow response time; zero percent idle CPU; high percent user CPU; high percent system CPU; large run queue size sustained over time.
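The CPU rules of thumb above can be written down as a small check. This is a sketch under assumptions: `cpu_findings` and its parameters are hypothetical names, "stays above" is interpreted as every sample exceeding the threshold, and the 50% cut-off for "low" utilization is an assumed value, since the text gives no number for it.

```python
from typing import List, Sequence


def cpu_findings(cpu_percent: Sequence[float],
                 queue_length: Sequence[float],
                 sql_server_dedicated: bool = False,
                 low_cpu_percent: float = 50.0) -> List[str]:
    """Return the rules of thumb that the sampled counters trigger."""
    findings: List[str] = []
    if not cpu_percent or not queue_length:
        return findings
    floor = min(cpu_percent)  # lowest utilization seen over the sampling window
    if floor > 95:
        findings.append("CPU bottleneck")
    elif sql_server_dedicated and floor > 85:
        findings.append("above SQL Server limit")
    # Long queue while utilization stays low (threshold is an assumption).
    if min(queue_length) > 2 and max(cpu_percent) < low_cpu_percent:
        findings.append("processor congestion")
    return findings


print(cpu_findings([96, 97, 99], [1, 1, 1]))
print(cpu_findings([30, 35, 40], [3, 4, 5]))
```

Using the minimum of the samples is a strict reading of "continuously"; a percentile of the samples would be a more forgiving alternative.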
Disk I/O:
1. Disk rate in UNIX resource monitoring (the same applies to Windows). If this value stays high, I/O is the problem. Consider switching to a faster disk system.
2. In Windows resource monitoring, if % Disk Time and Avg. Disk Queue Length are very high while Page Reads/sec is very low, there may be a disk bottleneck.
Symptoms of I/O becoming the system performance bottleneck: high disk utilization; large disk queue length; large percentage of time waiting for disk I/O; large physical I/O rate (not sufficient in itself); low buffer cache hit ratio (not sufficient in itself); large run queue with idle CPU.
4. Database server:
SQL Server database:
1. Cache Hit Ratio in SQL Server resource monitoring: the higher the value, the better. If it stays below 80%, consider adding memory.
2. If the Full Scans/sec counter shows a value higher than 1 or 2, analyze your queries to determine whether the full table scans are really needed and whether the SQL queries can be optimized.
3. Number of Deadlocks/sec: deadlocks hurt application scalability and lead to a poor user experience. This counter must be 0.
4. Lock Requests/sec: optimizing queries to reduce the number of reads lowers the value of this counter.
Oracle database:
1. If free memory is close to 0 and the library cache hit ratio or the data dictionary cache hit ratio is below 0.90, increase shared_pool_size.
Library cache (shared SQL area) and data dictionary cache hit ratios:
select (sum(pins - reloads)) / sum(pins) from v$librarycache;
select (sum(gets - getmisses)) / sum(gets) from v$rowcache;
Free memory:
select * from v$sgastat where name = 'free memory';
2. If the buffer cache hit ratio is below 0.90, increase the db_block_buffers parameter (unit: blocks).
Buffer cache hit ratio:
select name, value from v$sysstat where name in ('db block gets', 'consistent gets', 'physical reads');
hit ratio = 1 - (physical reads / (db block gets + consistent gets))
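As a worked example of the buffer cache formula above, the following Python sketch computes the ratio from the three statistic values. The function name and the sample numbers are made up for illustration; in practice the inputs would be the `value` column returned by the v$sysstat query.

```python
from typing import Optional


def buffer_cache_hit_ratio(db_block_gets: int,
                           consistent_gets: int,
                           physical_reads: int) -> Optional[float]:
    """hit ratio = 1 - physical reads / (db block gets + consistent gets)."""
    logical_reads = db_block_gets + consistent_gets
    if logical_reads == 0:
        return None  # no reads yet, so the ratio is undefined
    return 1 - physical_reads / logical_reads


# Illustrative values only, not measurements from a real instance.
ratio = buffer_cache_hit_ratio(db_block_gets=1000,
                               consistent_gets=9000,
                               physical_reads=500)
needs_more_buffers = ratio is not None and ratio < 0.90
print(ratio, needs_more_buffers)
```

With these sample values the ratio is about 0.95, which is above the 0.90 threshold, so the rule would not call for a larger db_block_buffers.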
3. If the number of redo log buffer space requests is large, increase the log_buffer parameter.
Log buffer space requests:
select name, value from v$sysstat where name = 'redo log space requests';
4. If the in-memory sort hit ratio is below 0.95, increase sort_area_size to avoid disk sorts.
In-memory sort hit ratio (the query returns a percentage, so the threshold corresponds to 95):
select round((100 * b.value) / decode((a.value + b.value), 0, 1, (a.value + b.value)), 2) from v$sysstat a, v$sysstat b where a.name = 'sorts (disk)' and b.name = 'sorts (memory)';
Note: the SQL Server and Oracle analysis above is only simple, basic analysis. Oracle analysis and tuning in particular is a specialty of its own; for deeper analysis, consult the relevant material.
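The in-memory sort query in item 4 above uses decode to avoid dividing by zero when no sorts have happened yet. The same arithmetic can be sketched in Python as follows; the function name and the sample counts are hypothetical, and the inputs stand for the 'sorts (disk)' and 'sorts (memory)' values from v$sysstat.

```python
def memory_sort_hit_percent(sorts_disk: int, sorts_memory: int) -> float:
    """Percentage of sorts done in memory, rounded to two decimals."""
    total = sorts_disk + sorts_memory
    # Mirrors decode(total, 0, 1, total): use 1 as the divisor when total is 0.
    denominator = 1 if total == 0 else total
    return round(100 * sorts_memory / denominator, 2)


# Illustrative counts only.
percent = memory_sort_hit_percent(sorts_disk=50, sorts_memory=950)
print(percent, percent < 95)
```

Because the result is a percentage, it is compared against 95 rather than 0.95.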
Analyzing the results is the most important part of performance testing. In practice, result analysis is complex and requires a lot of specialized knowledge, so I often do not know where to start with the data. It was also the part I found most awkward and difficult while learning performance testing. So, after studying a book on web performance testing practice, I made the following notes.
Note: this covers only part of the web application performance analysis in Chapter 1 of the book.
I hope to discuss it with you:
I. Basic knowledge of performance analysis:
1. Several important performance indicators: response time, throughput, throughput rate, TPS (transactions processed per second), and hit rate.
2. System bottlenecks fall into two categories: network and server. Server bottlenecks mainly involve the application server, web server, database server, and operating system.
3. A conventional, rough performance analysis method:
As the system load (the number of concurrent users) increases, the throughput and TPS curves should change in roughly the same way, which means the system is basically stable. If, as the load increases, the throughput curve rises to a certain point and then changes slowly or flattens out, a network bandwidth bottleneck has probably appeared. Similarly, if the hit rate/TPS curve changes slowly or flattens out, the server is starting to become a bottleneck.
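One rough way to spot where a throughput or TPS curve flattens is to look for the first load level at which the metric grows by less than some small fraction. The sketch below is only that: `plateau_start` and the 5% `min_gain` threshold are assumptions introduced for illustration, and the sample numbers are invented.

```python
from typing import Optional, Sequence


def plateau_start(loads: Sequence[int],
                  values: Sequence[float],
                  min_gain: float = 0.05) -> Optional[int]:
    """Return the first load level where the metric grew by less than
    `min_gain` (relative to the previous step), or None if it keeps growing."""
    for i in range(1, min(len(loads), len(values))):
        previous = values[i - 1]
        if previous > 0 and (values[i] - previous) / previous < min_gain:
            return loads[i]
    return None


users = [10, 20, 30, 40, 50]
tps = [100, 195, 280, 285, 286]   # invented sample data
print(plateau_start(users, tps))
```

The same function can be applied to the throughput curve and to the hit rate/TPS curve; which of the two flattens first hints at whether to look at the network or the server.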
4. I agree with the following basic performance analysis principle:
-- From the outside in, from the surface to the interior, layer by layer
The analysis can be divided into the following three steps:
Step 1: Compare the response time with the performance the user expects to determine whether a bottleneck exists;
Step 2: Compare Tn (network response time) and Ts (server response time) to determine whether the bottleneck is in the network or on the server;
Step 3: Analyze the response time of finer-grained components until the root cause of the performance bottleneck is identified.
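The three steps above can be sketched as a small triage routine. This is a simplified illustration: the function names, the returned labels, and the component names in the example are all hypothetical, and Step 3 is reduced to picking the slowest component from a breakdown that would normally come from the page component graphs.

```python
from typing import Dict


def locate_bottleneck(response_time: float, expected_time: float,
                      network_time: float, server_time: float) -> str:
    """Steps 1 and 2: decide whether there is a bottleneck and on which side."""
    if response_time <= expected_time:
        return "no bottleneck"          # Step 1: expectation is met
    if network_time > server_time:      # Step 2: compare Tn and Ts
        return "network"
    return "server"


def slowest_component(component_times: Dict[str, float]) -> str:
    """Step 3 (simplified): the component with the largest response time."""
    return max(component_times, key=component_times.get)


print(locate_bottleneck(response_time=8.0, expected_time=3.0,
                        network_time=1.5, server_time=6.5))
print(slowest_component({"web server": 0.8, "app server": 1.2, "database": 4.5}))
```

In a real analysis Step 3 is repeated, each time breaking the slowest component down further until the root cause is found.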
II. Specific analysis methods, using a web application as an example:
1. User transaction analysis:
A. Transaction Summary graph: shows the numbers of successful and failed transactions. By analyzing the success and failure data, you can directly judge whether the system is running normally. If many transactions fail, the system has a bottleneck or the program has a problem during execution.
B. Average Transaction Response Time graph:
Shows the average time taken to execute transactions in each second of the test scenario run. It also gives the maximum, minimum, and average values for the transactions in the scenario. It can be used to analyze the system's performance trend. If the response times of all transactions are basically consistent, performance is stable. Otherwise, if the average transaction response time gradually gets slower, performance is declining, and one possible cause is a memory leak.
C. Transactions per Second (TPS) graph:
Shows the number of transactions that passed, failed, and stopped in each second. It can be used to determine the actual transaction load on the system at any given moment. If the number of transactions the application system passes per unit time keeps falling as the test progresses, the server has a bottleneck.
D. Total Transactions per Second graph: shows the total number of transactions that passed, failed, and stopped in each second. If the curve is close to a straight line under constant load, performance is basically stable. If the total number of transactions passed per unit time keeps decreasing, overall performance is declining; the cause may be a memory leak or a defect in the program.
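Whether a transactions-per-second curve is "close to a straight line" or "keeps decreasing" can be estimated with a least-squares slope over the samples. The sketch below assumes evenly spaced samples; the function names and the 1% `tolerance` are assumptions chosen for illustration, not values from any tool.

```python
from typing import Sequence


def trend_slope(values: Sequence[float]) -> float:
    """Least-squares slope of the values against their sample index."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def classify_tps(values: Sequence[float], tolerance: float = 0.01) -> str:
    """Label the curve as declining when the slope per sample falls below
    -tolerance times the mean; otherwise call it stable."""
    if not values:
        return "stable"
    mean = sum(values) / len(values)
    if mean > 0 and trend_slope(values) / mean < -tolerance:
        return "declining"
    return "stable"


print(classify_tps([100, 98, 96, 94]))
print(classify_tps([50, 50, 50]))
```

A declining label is a prompt to check for the causes named above, such as a memory leak or a program defect; the slope alone does not identify which.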
E. Transaction Performance Summary graph: shows the minimum, maximum, and average execution time of each transaction. It can be used directly to judge whether response times meet user requirements (focus on the average and maximum execution times).
F. Transaction Response Time Under Load graph:
Shows the relationship between transaction response time and the number of users at any point in time, which gives you the system's performance data under concurrent users.
G. Transaction Response Time (Percentile) graph:
This graph is a comprehensive analysis based on the test results and should be read as a whole. The maximum response time of a transaction may be very long, but if most transactions have acceptable response times, the overall system performance still meets the requirements.
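The point of the percentile view, that one very slow transaction does not by itself make performance unacceptable, can be illustrated with a nearest-rank percentile. The `percentile` helper below is a hand-written sketch and the response times are invented; analysis tools may compute percentiles with a different interpolation method.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of the
    samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Nine fast transactions and one very slow one (invented data, in seconds).
times = [1.0] * 9 + [30.0]
print(max(times), percentile(times, 90))
```

Here the maximum is 30 seconds while the 90th percentile is 1 second, which is the situation described above: a long worst case, but acceptable response times for most transactions.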
H. Transaction Response Time (Distribution) graph:
Shows how many transactions fell into each response time range during the test. If the system defines minimum and maximum acceptable transaction response times in advance, you can use this graph to determine whether system performance is within the acceptable range.
The analysis in this step can only indicate where the bottleneck may be; further investigation is needed to locate it. Without the graphs this is a bit hard to follow, but if you are familiar with them it should be relatively easy.