Analysis principle:
Analyze each problem on its own terms, because application systems, test objectives, and the performance metrics of interest all differ.
Find bottlenecks in the following order, from easy to difficult.
Server hardware bottleneck -> network bottleneck (can be ignored on a LAN) -> server operating system bottleneck (parameter configuration) -> middleware bottleneck (parameter configuration, database, web server, etc.) -> application bottleneck (SQL statements, database design, business logic, algorithms, etc.)
Note: Not every analysis needs to go through the whole sequence above; decide how deep to go according to the test objectives and requirements. For systems with modest requirements, it is enough to determine where the hardware bottleneck will be when the application is under heavy future load (number of concurrent users, data volume).
Segment-by-segment elimination is a very effective method.
Information sources for analysis:
1) Error messages reported while the scenario is running
2) Monitoring metrics collected from the test results
I. Error message analysis
Analysis Examples:
1) Error: Failed to connect to server "Payment.baihe.com": [10060] Connection timed out
Error: Server "User.baihe.com" has shut down the connection prematurely
Analysis:
A. The application service has died.
(If this happens with only a few users, the problem is in the program, typically in how the program handles the database.)
B. The application service has not died.
(The application service's parameter settings are the problem.)
Example: If many client connections to the WebLogic application server are refused while the server side reports no errors, the AcceptBacklog attribute of the Server element in the WebLogic configuration may be set too low. If clients receive "connection refused" messages when connecting, increase the value by 25% each time.
C. Database connections.
(1. The application service's performance parameters may be set too low. 2. The maximum number of connections the database is started with has been reached; this limit depends on hardware memory.)
2) Error: Page download timeout (seconds) has expired
Analysis: possible causes include the following.
A. Application service parameters set too high, causing a server bottleneck
B. Too many images on the page
C. The program checks too many fields when it processes a table
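The WebLogic example above suggests raising AcceptBacklog by 25% each time connections are refused. The following is a minimal sketch of that arithmetic only; the function name and the starting value of 50 are assumptions for illustration, not values taken from WebLogic or from the test described here.

```python
import math


def next_backlog_values(start: int, steps: int, growth: float = 0.25) -> list:
    """Return successive backlog values, each 25% above the previous one.

    Values are rounded up because the attribute is an integer.
    """
    values = []
    current = start
    for _ in range(steps):
        current = math.ceil(current * (1 + growth))
        values.append(current)
    return values


# Hypothetical starting value; re-run the load test after each change.
print(next_backlog_values(50, 4))  # prints [63, 79, 99, 124]
```

After each increase, the scenario should be rerun and the value raised again only if connections are still refused.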
II. Monitoring metrics data analysis
1. Maximum number of concurrent users:
The maximum number of concurrent users that the application system can withstand in the current environment (hardware environment, network environment, and software environment, including parameter configuration).
During a scenario run, if a business operation fails for more than 3 users, or the server goes down, then the system cannot withstand the current concurrent-user load in the current environment. The maximum number of concurrent users is then the previous load level at which this did not happen.
If the measured maximum number of concurrent users meets the performance requirement, server resources are in good condition, and business operation response times meet user requirements, the result is acceptable. Otherwise, analyze the cause further based on each server's resource usage and the response times of the business operations.
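The rule above for reading off the maximum number of concurrent users can be sketched as follows. This is an illustration under assumptions: the StepResult record and its field names are hypothetical, and each record is taken to summarize one scenario run at a fixed load level.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StepResult:
    concurrent_users: int  # load level of this run
    failed_users: int      # users whose business operation failed
    server_down: bool      # whether the server went down during the run


MAX_FAILED_USERS = 3  # from the text: more than 3 failed users means overload


def max_concurrent_users(steps: List[StepResult]) -> Optional[int]:
    """Return the highest load level reached before the first overloaded run."""
    last_good = None
    for step in sorted(steps, key=lambda s: s.concurrent_users):
        if step.failed_users > MAX_FAILED_USERS or step.server_down:
            break  # this level is too much; keep the previous one
        last_good = step.concurrent_users
    return last_good


runs = [
    StepResult(50, 0, False),
    StepResult(100, 2, False),
    StepResult(150, 5, False),
]
print(max_concurrent_users(runs))  # prints 100
```

The function returns None when even the lowest load level fails, which signals that the test should restart from a smaller number of users.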
2. Business Operation Response Time:
Analysis of a scenario run should start with the Average Transaction Response Time graph and the Transaction Performance Summary graph. The Transaction Performance Summary graph shows which transactions take too long to respond during scenario execution.
Break the transactions down and analyze the performance of each page component. Which page components are causing the long transaction response time? Is the problem related to the network or to the server?
If the server is taking too long, use the appropriate server graph to identify the problematic server metrics and pinpoint the cause of the server performance degradation. If the network is taking too long, use the Network Monitor graph to identify the network problems that are causing the bottleneck.
2-5-10 principle: Put simply, when users get a response within 2 seconds, they feel the system is fast. When they get a response in 2-5 seconds, they feel the speed is acceptable. When they get a response in 5-10 seconds, they feel the system is slow but still tolerable. When there is still no response after more than 10 seconds, they find the system unacceptable, or assume it has stopped responding, and either leave the site or send the request again.
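The 2-5-10 principle maps directly onto a small classification function. The sketch below is one possible reading; the label strings are made up for illustration, and treating exactly 2, 5, and 10 seconds as belonging to the better band is an assumption, since the text does not say which side the boundaries fall on.

```python
def rate_response_time(seconds: float) -> str:
    """Classify a response time according to the 2-5-10 principle."""
    if seconds < 0:
        raise ValueError("response time cannot be negative")
    if seconds <= 2:
        return "fast"
    if seconds <= 5:
        return "acceptable"
    if seconds <= 10:
        return "slow but tolerable"
    return "unacceptable"


for t in (1.2, 3.5, 8.0, 12.0):
    print(t, rate_response_time(t))
```

Applied to the average response time of each transaction, this gives a quick first pass over which business operations need closer analysis.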
3. Server Resource monitoring metrics:
Memory:
1) In UNIX resource monitoring, if the Paging rate metric is occasionally high, threads were competing for memory at that time. If it stays high, memory may be the bottleneck. A low memory access hit ratio is another possible cause.
2) In Windows resource monitoring, if the Process\Private Bytes and Process\Working Set counters keep rising over a long period while the Memory\Available Bytes counter keeps falling, a memory leak is likely.
Symptoms that memory is the system performance bottleneck:
Very high page-out rate;
Processes entering an inactive state;
High activity on all disks that hold swap space;
Possibly high global system CPU utilization;
Out-of-memory errors
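The Windows memory leak heuristic described above (Private Bytes and Working Set rising while Available Bytes falls) can be sketched as a trend check. This is an assumption-laden illustration: it expects equally spaced samples already exported from the monitor, uses a least-squares slope as the definition of "rising" or "falling", and the function names are hypothetical.

```python
from typing import Sequence


def slope(samples: Sequence[float]) -> float:
    """Least-squares slope of equally spaced samples (units per sample)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def looks_like_memory_leak(private_bytes: Sequence[float],
                           working_set: Sequence[float],
                           available_bytes: Sequence[float]) -> bool:
    """True when process memory trends up while available memory trends down."""
    return (slope(private_bytes) > 0
            and slope(working_set) > 0
            and slope(available_bytes) < 0)


print(looks_like_memory_leak([100, 110, 120, 130],
                             [80, 85, 95, 100],
                             [500, 480, 470, 450]))  # prints True
```

A positive result is only a hint; the text requires the trend to persist over a long period, so the samples should cover a long run rather than a short spike.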
Processor:
1) CPU utilization in UNIX resource monitoring (the same applies to Windows):
If the value stays above 95%, the CPU is the bottleneck. Consider adding a processor or replacing it with a faster one. If the server is dedicated to SQL Server, the maximum acceptable limit is 80-85%.
The range of reasonable use is 60% to 70%.
2) In Windows resource monitoring, if System\Processor Queue Length is greater than 2 while processor utilization (% Processor Time) stays low, the processor is blocked.
Symptoms that the CPU is the system performance bottleneck:
Very slow response times (slow response time)
CPU idle time is 0 (zero percent idle CPU)
Excessive user CPU time (high percent user CPU)
Excessive system CPU time (high percent system CPU)
Long run queue sustained over time (large run queue size sustained over time)
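The two processor rules above (utilization staying above 95%, and a queue length above 2 with low utilization) can be combined into one check. The sketch below is illustrative only: the function name and result strings are hypothetical, "stays above" is interpreted as every sample in the window, and the 50% cutoff for "low" utilization is an assumption because the text gives no number.

```python
from typing import Sequence


def diagnose_cpu(utilization_pct: Sequence[float],
                 queue_length: Sequence[float],
                 low_utilization_pct: float = 50.0) -> str:
    """Apply the two processor rules to a window of monitoring samples."""
    if not utilization_pct or not queue_length:
        raise ValueError("need at least one sample of each metric")
    # Rule 1: utilization continuously above 95% means the CPU is the bottleneck.
    if all(u > 95 for u in utilization_pct):
        return "cpu bottleneck"
    # Rule 2: queue longer than 2 while utilization stays low means blocking.
    queue_backed_up = all(q > 2 for q in queue_length)
    cpu_mostly_idle = all(u < low_utilization_pct for u in utilization_pct)
    if queue_backed_up and cpu_mostly_idle:
        return "processor blocking"
    return "no cpu issue detected"


print(diagnose_cpu([97, 98, 99], [1, 2, 1]))
print(diagnose_cpu([20, 25, 30], [3, 4, 5]))
```

For a server dedicated to SQL Server, the 95 in the first rule would be replaced by the 80-85% limit mentioned above.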
Disk I/O:
1) In UNIX resource monitoring, if the disk rate metric stays high (a similar counter exists on Windows), there is an I/O problem. Consider switching to a faster disk system.
2) In Windows resource monitoring, if % Disk Time and Avg. Disk Queue Length are high while Page Reads/sec is low, there may be a disk bottleneck.
Symptoms that I/O is the system performance bottleneck:
High disk utilization (high disk utilization)
Long disk wait queue (large disk queue length)
High percentage of time spent waiting for disk I/O (large percentage of time waiting for disk I/O)
High physical I/O rate (large physical I/O rate; not sufficient in itself)
Low cache hit ratio (low buffer cache hit ratio; not sufficient in itself)
Long run queue while the CPU is idle (large run queue with idle CPU)
4. Database server:
SQL Server database:
1) In SQL Server resource monitoring, the higher the Cache Hit Ratio, the better. If it stays below 80%, consider adding memory.
2) If the Full Scans/sec counter shows a value higher than 1 or 2, analyze your queries to determine whether full table scans are really needed and whether the SQL can be optimized.
3) Number of Deadlocks/sec: deadlocks are very harmful to application scalability and lead to a poor user experience. This counter must be 0.
4) Lock Requests/sec: the value of this counter can be lowered by optimizing queries to reduce the number of reads.
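The SQL Server thresholds above can be checked mechanically once the counters have been averaged over the run. The following is a rough sketch: the function and parameter names are hypothetical, the inputs are assumed to be run averages, and the full-scan limit defaults to 2 because the text says "1 or 2". Lock Requests/sec is left out because the text gives no threshold for it.

```python
from typing import List


def check_sql_server(cache_hit_ratio_pct: float,
                     full_scans_per_sec: float,
                     deadlocks_per_sec: float,
                     full_scan_limit: float = 2.0) -> List[str]:
    """Return a list of findings for the SQL Server counters discussed above."""
    findings = []
    if cache_hit_ratio_pct < 80:
        findings.append("cache hit ratio below 80%: consider adding memory")
    if full_scans_per_sec > full_scan_limit:
        findings.append("full scans/sec is high: review queries for avoidable table scans")
    if deadlocks_per_sec > 0:
        findings.append("deadlocks detected: this counter must be 0")
    return findings


for finding in check_sql_server(72.0, 3.5, 0.1):
    print(finding)
```

An empty list means none of the three rules fired, not that the database is free of problems.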
Oracle Database:
1) If free memory is close to 0 and the library cache hit ratio or the data dictionary cache hit ratio is below 0.90, increase shared_pool_size.
2) If the data buffer cache hit ratio is below 0.90, increase the db_block_buffers parameter (measured in blocks).
3) If the log buffer request value is large, increase the log_buffer parameter.
4) If the in-memory sort hit ratio is below 0.95, increase sort_area_size to avoid disk sorts.
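The four Oracle rules above amount to a lookup from measured ratios to the parameter that should be raised. The sketch below is a hypothetical helper, not an Oracle API: the function name and argument names are made up, the ratios are assumed to be fractions between 0 and 1, and the two yes/no inputs stand in for judgments ("close to 0", "large") that the text does not quantify.

```python
from typing import List


def oracle_parameters_to_increase(free_memory_near_zero: bool,
                                  library_cache_hit: float,
                                  dictionary_cache_hit: float,
                                  buffer_cache_hit: float,
                                  log_buffer_requests_high: bool,
                                  memory_sort_hit: float) -> List[str]:
    """Return the Oracle parameters that the rules above say to increase."""
    params = []
    if free_memory_near_zero and (library_cache_hit < 0.90
                                  or dictionary_cache_hit < 0.90):
        params.append("shared_pool_size")
    if buffer_cache_hit < 0.90:
        params.append("db_block_buffers")
    if log_buffer_requests_high:
        params.append("log_buffer")
    if memory_sort_hit < 0.95:
        params.append("sort_area_size")
    return params


print(oracle_parameters_to_increase(True, 0.85, 0.97, 0.88, False, 0.99))
# prints ['shared_pool_size', 'db_block_buffers']
```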