A concrete example of LoadRunner results analysis


The most important and most difficult part of LoadRunner to master is the analysis of test results. The rest, recording scripts and configuring the load-test scenario, can be picked up easily after a few hands-on runs. For results analysis I will walk through an example with screenshots and commentary, in the hope that a concrete case is more helpful. The example simulates a number of users taking over a task at the same time, to test the system's responsiveness and locate its bottlenecks. The customer's requirement is that the response time for one user taking over a task stays within 5 seconds.
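As a rough illustration of how that 5-second requirement maps onto the script, here is a minimal sketch of a LoadRunner Vuser action (C) that wraps the take-over operation in a transaction and fails it when the measured time exceeds the target. The transaction name and request step are hypothetical placeholders, not the actual recorded script; only the pattern matters.

```c
// Minimal sketch of an Action() for this scenario (hypothetical names; the
// real recorded script will differ). Only the transaction pattern matters.
Action()
{
    double elapsed;

    lr_start_transaction("take_over_task");

    // Placeholder request standing in for the recorded "take over task" step.
    web_url("usertasks",
            "URL=http://192.168.0.135:8888/usertasks",
            "Resource=0",
            "Mode=HTML",
            LAST);

    // Customer requirement from the test plan: the take-over must finish within 5 s.
    elapsed = lr_get_transaction_duration("take_over_task");
    if (elapsed > 5.0)
        lr_end_transaction("take_over_task", LR_FAIL);
    else
        lr_end_transaction("take_over_task", LR_PASS);

    return 0;
}
```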

  2. Test environment:

2.1 Hardware environment:

CPU: Pentium 4 2.8E

Hard drive: 100 GB

Network environment: 100 Mbps

2.2 Software environment:

Operating system: Windows XP (English)

Server: Tomcat

Browser: IE 6.0

Architecture: B/S (browser/server)

  3. Add Monitoring Resources

This example adds the resource measurements we use most often in everyday testing. A few more specialized resources are not covered here for the time being; I will add them later.

The five graph categories most commonly used in Mercury LoadRunner Analysis are:

1. VUser

2. Transactions

3. Web Resources

4. Web Page Breakdown

5. System Resources

Select "Add graph" or "New graph" in analysis to see these resources. There are other resources that do not have data, we do not let it show.

  

If you want to see more graphs, clear the "Display only graphs containing data" check box at the lower left, select the graphs you want, and click "Open Graph".

When you open Analysis, the first thing to look at is the Summary Report, which gives an overview of the whole test. Not every item needs close attention; the main parts are:

Duration: shows how long the test ran. The tester should have a general sense of how much work the system handled in this period, which helps decide the duration of the next test run when additional load conditions are added.

Statistics Summary: gives only a general overview of the test data and has little bearing on the detailed analysis.

Transaction Summary: shows the average response time of each transaction; the unit is seconds.

The remaining items can be skipped; they are not very important.

4. Analysis of Rendezvous points

We usually insert a rendezvous point in the recorded script, and since we have used one here, we need to know when the Vusers gather at that point and how they are released from it. For this, look at the Vusers - Rendezvous graph.
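For reference, a rendezvous point is placed in the script with lr_rendezvous(), so every Vuser that reaches that line waits until the Controller's rendezvous policy releases the group together. A minimal sketch, with made-up rendezvous and transaction names:

```c
Action()
{
    // All Vusers block here until the Controller's rendezvous policy releases
    // them together; this is what produces the gathering seen in Figure 1.
    lr_rendezvous("take_over_task_start");

    lr_start_transaction("take_over_task");
    /* ... recorded take-over steps go here ... */
    lr_end_transaction("take_over_task", LR_AUTO);

    return 0;
}
```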

  

Figure 1

You can see that at around 3:50 all 30 users have gathered at the rendezvous point, and they are held there for about 3 minutes. At around 7:30 the users start to be released; at 9:30 there are still 18 users at the point, at 11:10 there are 5, and the whole process lasts about 12 minutes.

  

Figure 2

Figure 2 above compares the rendezvous graph with the average transaction response time.

Note: by default, LR does not put the two curves in the same graph when Analysis is opened; you have to merge them yourself. The steps are as follows:

Right-click the graph and choose Merge Graphs, then under "Select graph to merge with" choose the graph you want to compare against. See Figure 3:

  

Figure 3

In Figure 2 the darker curve is the average response time and the lighter one is the rendezvous point. About one minute after the Vusers gather at the rendezvous point, the average response time reaches its maximum, which shows how much strain concurrent users put on the system's performance. Next, look at the transaction-related measurements in the following figure.

  

Figure 4

This graph merges two data sets: Average Transaction Response Time and Running Vusers. You can see that vuser_init_Transaction (the system login) has little impact on the system. When the number of Vusers reaches 15, the average transaction response time rises noticeably; in other words, the system is at its best when about 14 users handle the transaction at the same time. One minute after the Vusers reach 30, the response time peaks, i.e. the maximum response time shows up with roughly a one-minute delay. After that, transaction response time starts to fall and the system stabilizes, which indicates that some users have already completed their operations. You can also see that to keep the transaction response time within 10 s, the number of Vusers cannot exceed 2, so it looks difficult to meet the customer's requirement.
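One reason the login shows so little impact is that it runs once per Vuser in vuser_init and is timed as its own transaction, outside the iterated Action. A minimal sketch of that structure is below; the form action and field names are hypothetical, not the application's real ones.

```c
vuser_init()
{
    // Login is timed separately so it does not inflate the task transactions
    // measured in Action(); it shows up in Analysis as vuser_init_Transaction.
    lr_start_transaction("vuser_init_Transaction");

    web_submit_data("login",
                    "Action=http://192.168.0.135:8888/login",  /* hypothetical */
                    "Method=POST",
                    "Mode=HTML",
                    ITEMDATA,
                    "Name=username", "Value=tester", ENDITEM,   /* hypothetical */
                    "Name=password", "Value=secret", ENDITEM,
                    LAST);

    lr_end_transaction("vuser_init_Transaction", LR_AUTO);

    return 0;
}
```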

When you are working on something, your boss will sometimes ask how far along you are. You say "half done" - but how much time did that half take? To see what percentage of transactions completed within a given time range, look at the graph below, Transaction Response Time (Percentile).

  

The circled area in the figure shows that 10% of the transactions have a response time of around 80 s. 80 s is not a small number for a user, and that still covers only 10% of the business. Would you say this system performs well?

In real work, not everything can be finished in a very short time; tasks that need time must be given an appropriate amount of it. Because that time is not evenly distributed, some operations take long and some take little, and we need to know which is which. LR also lets us see where most transaction response times fall, so we can judge how much work it will take to improve the system.

Transaction Response Time (Distribution)

This graph displays the distribution of transaction execution times in the scenario. If you define the minimum and maximum transaction times you are willing to accept, you can determine whether the server's performance falls within the acceptable range.
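Analysis computes the percentile and distribution graphs for you, but as a back-of-the-envelope illustration of what they represent, here is a small standalone C sketch that takes raw transaction times (the sample values are made up, not from this test) and reports the 90th percentile and the fraction within an acceptable limit:

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative only: Analysis derives these figures from the scenario results.
   The sample values below are invented, not taken from this test. */
static int cmp_double(const void *a, const void *b)
{
    double d = *(const double *)a - *(const double *)b;
    return (d > 0) - (d < 0);
}

int main(void)
{
    double times[] = { 3.2, 4.8, 61.0, 75.5, 80.1, 95.0, 110.2, 128.7, 139.9, 142.3 };
    int n = sizeof(times) / sizeof(times[0]);
    double max_acceptable = 5.0;   /* the customer's 5 s target from the test plan */
    int i, within = 0;

    qsort(times, n, sizeof(double), cmp_double);

    for (i = 0; i < n; i++)
        if (times[i] <= max_acceptable)
            within++;

    /* Rough 90th percentile: the value below which 90% of the sorted samples fall. */
    printf("90th percentile response time: %.1f s\n", times[(int)(0.9 * n) - 1]);
    printf("Transactions within %.0f s: %d of %d (%.0f%%)\n",
           max_acceptable, within, n, 100.0 * within / n);
    return 0;
}
```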

  

It is clear that most transaction response times fall between 60 s and 140 s. In the projects I have tested, the maximum response time most customers will accept is around 20 s. 140 s! Few people will wait that long for a page to appear.

Looking at the data above, it is not hard to see that the system performs poorly in this environment. Every result has a cause, so what makes the performance this bad? Let's analyze it step by step.

There are many possible reasons for poor system performance; let's look at the application first. I have to admit LR is really powerful here, which is why I like it. Start with the Web Page Breakdown graph.

  

An application is made up of many components; when the overall performance is poor, we break it down and analyze it thoroughly. The figure shows all the web pages involved in the test, and Web Page Breakdown displays the download time of each page. Clicking "Web Page Breakdown" at the lower left expands the view so you can see every element of each page: CSS stylesheets, JS scripts, JSP pages, and so on.

Select the page in "Select Page to Break Down".

  

See figure.

After selecting http://192.168.0.135:8888/usertasks in "Select Page to Break Down", the two components that belong to it appear below. In the first row, the Connection and First Buffer time take up almost the entire download time, so that is where the time is going and where we should start.
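If you want a quick in-script cross-check of where the time for this page goes (in addition to the Web Page Breakdown graph, which normally has to be enabled as Web Page Diagnostics in the Controller), you can log the request's download time and size right after the step. A minimal sketch using standard web_get_int_property() calls; the transaction name is made up:

```c
Action()
{
    int dl_time, dl_size;

    lr_start_transaction("usertasks_page");

    web_url("usertasks",
            "URL=http://192.168.0.135:8888/usertasks",
            "Resource=0",
            "Mode=HTML",
            LAST);

    /* Per-step figures reported by the protocol engine; useful as a sanity
       check against the Connection / First Buffer split seen in Analysis. */
    dl_time = web_get_int_property(HTTP_INFO_DOWNLOAD_TIME);  /* milliseconds */
    dl_size = web_get_int_property(HTTP_INFO_DOWNLOAD_SIZE);  /* bytes */
    lr_log_message("usertasks: %d ms, %d bytes", dl_time, dl_size);

    lr_end_transaction("usertasks_page", LR_AUTO);
    return 0;
}
```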

  

  

It is also possible that in your own application the longest time is spent on the client side, or somewhere else entirely; it depends on your own test results. Now let's look at how to analyze CPU, memory, and hard disk bottlenecks:

First we have to monitor CPU, memory, and hard disk resource usage. The following counters provide the basis for the analysis (a minimal sketch of sampling a few of them programmatically follows the list).

% Processor Time (Processor_Total): the percentage of processor time consumed. If the server is dedicated to SQL Server, the maximum acceptable upper limit is 80%-85%. This is the usual "CPU usage" figure.

% User Time (Processor_Total): represents CPU-consuming database operations such as sorting and executing aggregate functions. If this value is high, consider adding indexes, using simpler table joins, and horizontally partitioning large tables to bring it down.

% DPC Time (Processor_Total): the lower the better. On a multiprocessor system, if this value is greater than 50% and Processor\% Processor Time is very high, adding a network adapter may improve performance, provided the network is not already saturated.

% Disk Time (PhysicalDisk_Total): the percentage of time the selected disk drive is busy servicing read or write requests. If all three disk counters are large, the hard disk is not the bottleneck; if only % Disk Time is large and the other two are moderate, the hard disk may be the bottleneck. Before logging this counter on Windows 2000, run diskperf -yd at the command line. If the value stays above 80% for a long time, it may indicate a memory leak.

Available Bytes (Memory): the amount of available physical memory. If Available MBytes is small (4 MB or less), the computer may have too little total memory, or a program may not be releasing memory.

Context Switches/sec (System): (for the instantiated Inetinfo and DLLHost processes) if you decide to increase the thread pool size, you should monitor these three counters (including the one above). Increasing the number of threads may increase context switching to the point where performance falls instead of rising. If the context-switch values of the 10 instances are very high, you should reduce the thread pool size.

Disk Reads/sec (PhysicalDisk_Total): the number of read operations on the hard disk per second.

Disk Writes/sec (PhysicalDisk_Total): the number of write operations on the hard disk per second.

Page Faults/sec: compare the page faults generated by a process with those generated by the system as a whole to determine the process's contribution to system page faults.

Pages/sec: the number of pages retrieved per second; this number should stay below one page per second.

Working Set: the memory pages most recently used by the threads of a process; it reflects the number of memory pages each process is using. If the server has enough free memory, pages are left in the working set; when free memory falls below a specific threshold, pages are trimmed out of the working set.

Avg. Disk Queue Length: the average number of read and write requests queued for the selected disk during the sample interval. This value should be no more than 1.5 to 2 times the number of disks. To improve performance, add disks. Note: a RAID volume actually consists of more than one disk.

Avg. Disk Read Queue Length / Avg. Disk Write Queue Length: the average number of read (write) requests queued for the disk.

Disk Reads/sec / Disk Writes/sec: the number of disk read and write operations per second; together they should stay below the maximum capacity of the disk device.

Avg. Disk sec/Read: the average time, in seconds, needed to read data from this disk.

Avg. Disk sec/Transfer: the average time, in seconds, needed to write data to this disk.

Bytes Total/sec: the rate at which bytes are sent and received over the network interface, including framing characters. To decide whether network speed is a bottleneck, compare this value to the bandwidth of the current network.

Page Reads/sec: the number of physical database page reads issued per second, totaled across all databases. Because physical I/O is expensive, you can reduce this cost with a larger data cache, smarter indexes, more efficient queries, or changes to the database design.

Page Writes/sec: the number of physical database pages written per second.
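The LoadRunner Controller normally collects these Windows counters for you once you add the target machine under System Resources. As promised above, though, here is a minimal standalone sketch (Windows, C, PDH API) of sampling a few of them outside LR; it assumes an ANSI build linking pdh.lib, and error checking is omitted for brevity.

```c
/* Standalone Windows sketch (not a LoadRunner script): sample three of the
   counters above with the PDH API. Counter paths are the standard English ones. */
#include <windows.h>
#include <pdh.h>
#include <stdio.h>
#pragma comment(lib, "pdh.lib")

int main(void)
{
    PDH_HQUERY query;
    PDH_HCOUNTER cpu, ctx, avail;
    PDH_FMT_COUNTERVALUE val;

    PdhOpenQueryA(NULL, 0, &query);
    PdhAddCounterA(query, "\\Processor(_Total)\\% Processor Time", 0, &cpu);
    PdhAddCounterA(query, "\\System\\Context Switches/sec", 0, &ctx);
    PdhAddCounterA(query, "\\Memory\\Available Bytes", 0, &avail);

    PdhCollectQueryData(query);      /* rate counters need two samples */
    Sleep(1000);
    PdhCollectQueryData(query);

    PdhGetFormattedCounterValue(cpu, PDH_FMT_DOUBLE, NULL, &val);
    printf("%% Processor Time     : %.1f\n", val.doubleValue);
    PdhGetFormattedCounterValue(ctx, PDH_FMT_DOUBLE, NULL, &val);
    printf("Context Switches/sec : %.0f\n", val.doubleValue);
    PdhGetFormattedCounterValue(avail, PDH_FMT_DOUBLE, NULL, &val);
    printf("Available Bytes      : %.0f\n", val.doubleValue);

    PdhCloseQuery(query);
    return 0;
}
```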

  1. Judge the application problem

If inefficient application code or a flawed system design causes a large number of context switches (Context Switches/sec shows a high value), this consumes a large amount of system resources. If system throughput falls while CPU usage stays high, and the switching rate is above 15,000 when this happens, then context switching is excessive.

  

Looking at the graph as a whole, Context Switches/sec changes little while the throughput curve keeps a fairly steep slope, yet Context Switches/sec is already above 15,000, so the program still needs further optimization.

  2. Determine CPU bottlenecks

If the queue length shown by Processor Queue Length stays at 2 or more and processor utilization (% Processor Time) stays above 90%, there is probably a processor bottleneck. If Processor Queue Length shows a queue longer than 2 but processor utilization stays low, the problem is more likely processor blocking, and the processor itself is not the bottleneck.

  

Here the average % Processor Time is greater than 95% and Processor Queue Length is greater than 2, so we can conclude there is a CPU bottleneck: the CPU can no longer keep up with the program and needs to be upgraded.
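Expressed as code, the rule from this section might look like the following sketch (plain C). The 90% and queue-length-2 thresholds come from the text above; the 50% cutoff for "low" utilization is an arbitrary choice made only for this illustration.

```c
#include <stdio.h>

typedef enum { CPU_OK, CPU_BOTTLENECK, CPU_BLOCKING } cpu_verdict;

/* Illustrative encoding of the CPU rule above. */
static cpu_verdict judge_cpu(double processor_time_pct, double processor_queue_len)
{
    if (processor_queue_len >= 2.0 && processor_time_pct > 90.0)
        return CPU_BOTTLENECK;      /* sustained queue plus high utilization */
    if (processor_queue_len > 2.0 && processor_time_pct < 50.0)
        return CPU_BLOCKING;        /* queued work but an idle CPU: blocking */
    return CPU_OK;
}

int main(void)
{
    /* Roughly the situation in the figure: average utilization above 95%,
       queue length above 2 (sample numbers, not exact readings). */
    cpu_verdict v = judge_cpu(95.0, 3.0);
    printf("%s\n", v == CPU_BOTTLENECK ? "CPU bottleneck"
                 : v == CPU_BLOCKING   ? "Processor blocking"
                                       : "CPU OK");
    return 0;
}
```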

  3. Determine the memory leak problem

For memory, the main thing to check is whether the application leaks memory. When a leak occurs, the Process\Private Bytes and Process\Working Set counters keep rising while Available Bytes keeps falling. Memory-leak testing should run for a long time, so you can observe how the application responds once all memory has been exhausted.

  

You can see that this program has no memory-leak problem. Memory leaks usually show up after a service has been running for a long time, when memory is exhausted because some code never releases it; this is also a reminder of what to watch for in stability testing.
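The trend described above (Private Bytes and Working Set climbing while Available Bytes falls) can also be checked outside Analysis by sampling those counters over time and looking at the direction of each series. A minimal sketch over already-collected samples; the arrays are placeholders, not data from this test.

```c
#include <stdio.h>

/* Returns 1 if the series never decreases. */
static int rising(const double *v, int n)
{
    int i;
    for (i = 1; i < n; i++)
        if (v[i] < v[i - 1])
            return 0;
    return 1;
}

/* Returns 1 if the series never increases. */
static int falling(const double *v, int n)
{
    int i;
    for (i = 1; i < n; i++)
        if (v[i] > v[i - 1])
            return 0;
    return 1;
}

int main(void)
{
    /* Placeholder samples taken at regular intervals during a long run. */
    double private_bytes[]   = { 5.1e7, 5.2e7, 5.2e7, 5.3e7, 5.4e7 };
    double working_set[]     = { 6.0e7, 6.1e7, 6.3e7, 6.3e7, 6.5e7 };
    double available_bytes[] = { 9.0e8, 8.8e8, 8.5e8, 8.3e8, 8.0e8 };
    int n = 5;

    /* Leak signature from the text: process memory keeps growing while
       Available Bytes keeps shrinking over a long test. */
    if (rising(private_bytes, n) && rising(working_set, n) && falling(available_bytes, n))
        printf("Possible memory leak: keep the soak test running.\n");
    else
        printf("No leak signature in these samples.\n");
    return 0;
}
```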

Attachment:

CPU Information:

Processor\% Processor Time shows processor usage.

You can also monitor Processor\% User Time and % Privileged Time for more information.

The Server Work Queues\Queue Length counter indicates a processor bottleneck: a queue length greater than 4 suggests possible processor congestion.

System\Processor Queue Length is used for bottleneck detection, together with Process\% Processor Time and Process\Working Set.

Process\% Processor Time: the sum of processor time used on each processor by all threads of the process.

Hard Drive information:

Physical disk/% Disk time

Physical Disk/avg.disk Queue Length

For example, include Page Reads/sec, % Disk Time, and Avg. Disk Queue Length. If the page-read rate is low while % Disk Time and Avg. Disk Queue Length are high, there may be a disk bottleneck. However, if the queue length increases while the page-read rate does not decrease, memory is insufficient.

Observe the Processor\Interrupts/sec counter, which measures the rate of service requests from input/output (I/O) devices. If this counter increases significantly without a corresponding increase in system activity, there is a hardware problem.

Physical Disk/disk reads/sec and Disk writes/sec

Physical disk/current Disk Queue Length

Physical disk/% Disk time

LogicalDisk/% Free Space

When you test disk performance, log performance data to another disk or computer so that the data does not interfere with the disk you are testing.

Additional counters worth observing include Physical Disk/Avg. Disk sec/Transfer, Avg. Disk Bytes/Transfer, and Disk Bytes/sec.

The Avg. Disk sec/Transfer counter reflects how long the disk takes to complete a request. A higher value indicates that the disk controller is repeatedly retrying the disk because of failures, and these failures increase the average disk transfer time. For most disks, an average disk transfer time greater than 0.3 seconds counts as high.

You can also look at Avg. Disk Bytes/Transfer. A high value indicates the disk drive is generally running efficiently, while an application that accesses the disk inefficiently produces a lower value. For example, an application that accesses a disk randomly increases the Avg. Disk sec/Transfer time, because random transfers require more seek time.

Disk Bytes/sec gives the throughput of the disk system.

Determining workload balance: to balance the load on a network server you need to know how busy its disk drives are. Use the Physical Disk\% Disk Time counter, which shows the percentage of time the drive is active. If % Disk Time is high (over 90%), check Physical Disk\Current Disk Queue Length to see how many system requests are waiting for disk access. The number of waiting I/O requests should stay at no more than 1.5 to 2 times the number of spindles that make up the physical disk.

Most disks have a single spindle, although redundant array of inexpensive disks (RAID) devices usually have several. A hardware RAID device appears as one physical disk in System Monitor; a software RAID device appears as multiple drives (instances). You can monitor the Physical Disk counters for each physical drive (other than RAID), or use the _Total instance to monitor data for all of the computer's drives.

Use the Current Disk Queue Length and % Disk Time counters to detect bottlenecks in the disk subsystem. If both values stay consistently high, consider upgrading the disk drive or moving some files to another disk or server.
