Performance test model and evaluation

Performance measurement

Performance only matters once you decide to measure it. Many people, however, struggle to determine which measurements they actually need, and even with the data in hand they do not know what to do with it. As a result, many try to collect every piece of information available, which adds excessive load to the system and produces a great deal of seemingly meaningless data. Faced with this, some give up on measurement altogether and optimize the system by intuition instead.

Of course, we shouldn't do this. Instead, we should measure systematically, step by step. First, understand why you want to measure performance and what you hope to achieve by doing so. Without a goal, there is no way to accomplish anything.

Next, you need to understand what you are measuring and what the measurements mean. This may require creating a model against which to interpret your data; only by combining data with a model can you obtain valuable information. For example, forecasting the weather requires data such as temperature, humidity, and atmospheric pressure. Even if I know each of these values, without a climate model for the region I still cannot predict its weather. Raw measurements are hard to interpret on their own; they become much easier to understand when placed in a meaningful model.

 

Measurement metrics

Let's start with the metrics our model needs to collect. The basic system metrics are:

Response time (R)
Throughput (X)
Resource utilization (U)
Service demand (D)

Response time measures how long it takes to complete a specific request. It is a very important metric because it is a direct index of user experience. Even so, make sure you understand exactly what you are measuring: system-level response time is quite different from component-level response time, because the system level includes factors such as queue time.

It is also harder to measure than the other metrics because it varies more. You therefore need to understand the distribution of response times. If an application responds within 2 seconds for most users but takes 10 seconds for 10% of them, you must know the distribution in order to assess the problem accurately and fix it. This means collecting the response times and computing their standard deviation; ideally, display the distribution as a histogram.
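As a quick sketch (using Python with made-up response-time data, not figures from the original), the distribution can be summarized with a mean, a standard deviation, a percentile, and a rough histogram:

```python
import statistics

# Hypothetical response times (seconds) collected from a load test;
# most requests finish near 2 s, but a slow tail sits near 10 s.
response_times = [1.8, 2.1, 1.9, 2.0, 1.7, 9.8, 2.2, 1.6, 10.4, 2.0]

mean = statistics.mean(response_times)
stdev = statistics.stdev(response_times)

# 95th percentile via the nearest-rank method.
ranked = sorted(response_times)
p95 = ranked[max(0, int(round(0.95 * len(ranked))) - 1)]

# A coarse text histogram makes the bimodal shape obvious.
for lo in range(0, 12, 2):
    count = sum(1 for t in response_times if lo <= t < lo + 2)
    print(f"{lo:2d}-{lo + 2:2d}s | {'#' * count}")

print(f"mean={mean:.2f}s stdev={stdev:.2f}s p95={p95:.2f}s")
```

The mean alone (about 3.5 s here) hides the slow tail; the percentile and histogram expose it, which is exactly why the distribution matters.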

Throughput indicates the volume of transactions the system can execute in a given period of time. It is a good index of the system's capacity to handle load and is usually considered together with response time. Because it is not user-centric, it is most relevant for non-interactive or batch-processing systems.

Resource utilization measures how heavily specific elements of the system are being used. It indicates the underlying state of the system, which makes it useful for capacity planning, and it is an easy metric to understand. Many people start with processor and memory utilization, but these are not the most useful measures of system performance on their own.

What we really want to know is how much of each resource a request consumes. For this we use the derived metric service demand (D) (rendered as "service request" in some translations), which shows how heavily a specific resource or service is used. It is calculated as:

D = U / X

This gives us a clear picture of resource usage, normalized by throughput.
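A minimal sketch of the calculation, with hypothetical utilization and throughput numbers:

```python
def service_demand(utilization, throughput):
    """Service demand D = U / X: resource time consumed per completed request."""
    return utilization / throughput

# Hypothetical measurement: the CPU is 60% busy while the system
# completes 100 transactions per second.
d_cpu = service_demand(0.60, 100.0)
print(f"CPU service demand: {d_cpu * 1000:.1f} ms per transaction")
```

Here each transaction consumes 6 ms of CPU time regardless of how many transactions are in flight, which is what makes D useful for comparing runs at different load levels.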

 

Correlating the metrics

The preceding metrics are related to one another, and understanding those relationships is an important first step in creating a performance model. To see the relationships clearly, we can plot them against each other (the chart referenced here is missing from this copy of the article).

 

The figure makes several things clearer. We note that throughput (X) and response time (R) often rise together. In a lab setting, or for a non-interactive application, we usually want to achieve the highest possible throughput. For a user-driven production environment, we usually want to maximize throughput while keeping most requests at or below a specific response time. For example, we might maximize throughput subject to 95% of requests completing in 2 seconds or less; under that constraint, we may find the maximum achievable throughput is 100 transactions per second.
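As a sketch, given hypothetical load-test results pairing each throughput level with its measured 95th-percentile response time (all numbers invented for illustration), picking the maximum sustainable throughput under a 2-second SLA looks like:

```python
# Hypothetical load-test results: (throughput in tx/s, 95th-percentile
# response time in seconds) measured at increasing load levels.
load_test = [(25, 0.4), (50, 0.7), (75, 1.1), (100, 1.9), (125, 3.5), (150, 8.2)]

SLA_P95 = 2.0  # seconds

# Highest measured throughput whose p95 response time still meets the SLA.
max_x = max(x for x, p95 in load_test if p95 <= SLA_P95)
print(f"max sustainable throughput: {max_x} tx/s")
```

Note how response time degrades sharply past 100 tx/s in this made-up data set; that point is the constraint, not the raw peak throughput of 150 tx/s.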

Looking further at the figure, we can see that resource utilization usually governs the system's behavior. At some point, contention for a resource causes throughput to drop sharply and response time to climb; the drop in throughput mirrors the rise in response time, producing a knee in the curves. Past the knee, overall service performance degrades because the system spends most of its time managing resource contention rather than servicing requests. It is important to build a system performance model and see how these three metrics behave in your application. You also want to know which metric triggers the load spike and at exactly what value, because that value is useful for setting alarm thresholds in production monitoring tools.

 

Metrics and the system model

After deciding which metrics to measure, we are ready to put the data into our model.

A typical software system consists of four main layers: client request processing, including initiated requests and the sessions that may be bound to them; request-execution management, where requests wait before being assigned to an execution thread; the application itself, including all of the program code; and the underlying services, including common elements such as JDBC and JMS as well as the connections between your application and the outside world. All of these elements may run on a platform or virtual machine such as the JVM, which in turn consumes operating-system resources: CPU, memory, physical disk, and network connections. The original article illustrates this with a J2EE model diagram.

We can now see how to measure the metrics discussed earlier at each layer. We need to measure and understand the distribution of response time at each layer: the client-request processing layer, the application-code layer (at the level of components, methods, or statements), and the services layer. At the client-request layer, throughput matters most: we must know how much user load our system can handle. Within the system, resource utilization is measured in many different places (the operating system, execution threads, services, and so on), so we can correlate this information and see how the different elements of the system affect one another.

 

The overhead of measurement

There is no free lunch: whenever you take these measurements in real time, you add load to the system. By understanding the load our measurements cause, we can make an informed trade-off between the amount of information we want to collect and the load collecting it adds. The best way to quantify this load is with the service-demand calculation described above.

When creating an accurate model of system performance, you must understand the load impact of measurement itself. J2EE applications are composed of interconnected systems that behave slightly differently from run to run. The only way to handle this is to pin down everything you can: you don't want your performance measurements skewed by batch jobs that suddenly start on one of your servers. Rerunning the benchmark regularly is also a good practice; it ensures you quickly notice any sudden drop in system performance, which would otherwise invalidate your data.
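For example (all numbers hypothetical), comparing the service demand with instrumentation switched off and on quantifies the measurement overhead:

```python
def service_demand(utilization, throughput):
    """Service demand D = U / X."""
    return utilization / throughput

# Hypothetical: CPU demand per transaction without and with instrumentation,
# measured at the same throughput of 100 tx/s.
d_baseline = service_demand(0.55, 100.0)  # instrumentation off
d_measured = service_demand(0.60, 100.0)  # instrumentation on

overhead = d_measured - d_baseline
print(f"measurement overhead: {overhead * 1000:.1f} ms per transaction "
      f"({overhead / d_baseline:.0%} of baseline demand)")
```

Expressing the overhead per transaction, rather than as raw utilization, keeps the comparison fair even if the two runs were not at exactly the same load.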

 

Test and Analysis

After establishing a test model and clarifying the testing focus for a component, you can use the performance-testing and monitoring methods described earlier to obtain test data and then analyze it.

 

A small example of I/O Evaluation

Generally, we can easily observe memory and CPU pressure on a database server, but there is no equally intuitive way to judge I/O pressure. A disk has two important parameters: seek time and rotational latency. Its normal I/O count is given by formula ①: 1000 / (seek time + rotational latency) × 0.75, with the times in milliseconds; I/O within this range is normal. When the actual I/O count reaches 85% of this value or more, an I/O bottleneck is generally considered to exist. Theoretically, a disk sustains about 125 random I/Os per second and about 225 sequential I/Os per second. Data files see random reads and writes, while log files see sequential reads and writes. Therefore, it is recommended to store data files on RAID 5 and log files on RAID 10 or RAID 1.
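A small sketch of formula ① with assumed drive characteristics (the 5 ms average seek time and 3 ms rotational latency are illustrative values, not figures from the original):

```python
def nominal_iops(seek_ms, rotational_latency_ms):
    """Formula ①: normal I/O count = 1000 / (seek + rotational latency) * 0.75."""
    return 1000.0 / (seek_ms + rotational_latency_ms) * 0.75

# Hypothetical drive: 5 ms average seek time, 3 ms rotational latency.
iops = nominal_iops(5.0, 3.0)
print(f"normal I/O capacity:  {iops:.2f} IOPS")
print(f"bottleneck threshold: {iops * 0.85:.2f} IOPS (85% of capacity)")
```

With these assumed parameters the drive's normal capacity works out to 93.75 IOPS, and sustained load above about 80 IOPS would indicate a bottleneck by the 85% rule.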

Suppose we observe the following PhysicalDisk performance counters on a RAID 5 array built from four hard disks:

Avg. Disk Queue Length: 12
Avg. Disk sec/Read: 0.035
Avg. Disk sec/Write: 0.045
Disk Reads/sec: 320
Disk Writes/sec: 100

Per disk, the average queue length is 12 / 4 = 3; the guideline is that each disk's average queue should not exceed 2. Avg. Disk sec/Read should not exceed 11–15 ms, and Avg. Disk sec/Write is generally recommended to stay below 12 ms.
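Checking the observed counters against the per-disk guidelines can be sketched as:

```python
# Observed PhysicalDisk counters from the example (RAID 5, four disks).
disks = 4
queue_length = 12
sec_per_read = 0.035   # seconds
sec_per_write = 0.045  # seconds

per_disk_queue = queue_length / disks
print(f"queue length per disk: {per_disk_queue}  (guideline: <= 2)")
print(f"read latency:  {sec_per_read * 1000:.0f} ms (guideline: <= 15 ms)")
print(f"write latency: {sec_per_write * 1000:.0f} ms (guideline: <= 12 ms)")
```

All three values exceed their guidelines here (queue 3 per disk, 35 ms reads, 45 ms writes), which is what motivates the analysis that follows.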

From these results, we can see that the disks' I/O capability does not meet our requirements: the large number of requests is causing queue waiting, most likely because your SQL statements are triggering extensive table scans. If the requirements still cannot be met after optimization, the following formulas can help you calculate how many hard disks are needed to satisfy the concurrency requirements:

RAID 0: I/Os per disk = (reads + writes) / number of disks
RAID 1: I/Os per disk = [reads + (2 × writes)] / 2
RAID 5: I/Os per disk = [reads + (4 × writes)] / number of disks
RAID 10: I/Os per disk = [reads + (2 × writes)] / number of disks

For our RAID 5 example, the result is [320 + (4 × 100)] / 4 = 720 / 4 = 180 I/Os per disk. You can then obtain the disk's normal I/O capacity from formula ①. Assuming a normal I/O count of 125, meeting the load requires 720 / 125 = 5.76, that is, six disks.
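The four formulas and the worked example can be sketched as follows (the function name and structure are mine, not from the original):

```python
import math

def ios_per_disk(raid, reads, writes, disks):
    """Back-end I/Os each disk must absorb, per the RAID formulas above."""
    if raid == "raid0":
        return (reads + writes) / disks
    if raid == "raid1":
        return (reads + 2 * writes) / 2
    if raid == "raid5":
        return (reads + 4 * writes) / disks
    if raid == "raid10":
        return (reads + 2 * writes) / disks
    raise ValueError(f"unknown RAID level: {raid}")

# Example from the text: 320 reads/sec and 100 writes/sec on a 4-disk RAID 5.
per_disk = ios_per_disk("raid5", 320, 100, 4)  # 180.0 I/Os per disk
backend_total = 320 + 4 * 100                  # 720 back-end I/Os in total
disks_needed = math.ceil(backend_total / 125)  # 125 = theoretical random IOPS
print(per_disk, disks_needed)
```

Rounding 720 / 125 = 5.76 up with `math.ceil` reproduces the article's conclusion that six disks are required.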

Note, however, that Disk Reads/sec and Disk Writes/sec are hard to estimate accurately in advance; you can only take averages measured while the system is busy and use those as the basis for the formulas. Likewise, it is often difficult to obtain the seek-time and rotational-latency figures from the customer, in which case you can only fall back on the theoretical value of 125 I/Os per second.

 

From: http://www.cnblogs.com/argb/p/3448649.html

