[Note] Performance Testing

Contributed by Liu Xiaoyang

Performance testing is a task that looks simple but is hard to do well. In my experience, every round of testing produces different results and exposes different problems, and it is not something as simple as knowing Java and Tomcat; put plainly, it cannot be done just by being able to write code.

Taking this performance test as an opportunity, I will reorganize my understanding of Linux, network I/O, and other fundamentals, and share my experience in locating performance bottlenecks.

Background:

The purpose of this performance test is to measure the performance of a set of interfaces developed on the company's internal RPC framework and to obtain accurate performance indicators for them. It is also a chance to look at the code from another angle, for example from the perspective of performance or of friendliness to the operating system. I will skip the lengthy environment-setup process. Performance testing is in fact a job that tests one's patience and attention to detail: you need a clear picture of the business logic, of the whole network topology and the call relationships in it, and of the roles and responsibilities of the load generators (the "presses"), the application servers, the database servers, and the cache servers. Where possible, you should squeeze out their performance limits, and you should clearly understand the commands and settings that can affect their performance (for example, the GC log printing options among the JVM startup parameters, disabling just-in-time compilation, or adjusting the sizes and ratios of the GC generations).
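For reference, the kind of JDK 7-era startup options being alluded to look roughly like this (the values here are arbitrary examples, not the ones used in this test):

-verbose:gc -XX:+PrintGCDetails -Xloggc:gc.log      (GC log printing)
-Xint                                               (disable just-in-time compilation, interpret only)
-Xms2g -Xmx2g -Xmn512m -XX:SurvivorRatio=8          (heap size and generation sizes/ratios)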

Second, you also need to be clear about the core flow and business logic of the program under test, and try to remove the influence of non-core components. For example, this test is about the performance of the RPC interface and the efficiency of service processing; the distributed cache and the database involved are not its focus, so the cache and database machines should be put in the same LAN segment as the application server, or even on the same machine, so that their performance does not directly distort the test results. In short, mock away all non-core, unrelated factors.

Test tool/command

Because what is being tested is an RPC interface rather than an HTTP interface, LoadRunner and similar testing tools cannot meet the requirement, so JMeter was chosen as the testing tool. JMeter is a performance testing tool implemented 100% in Java. The way to use it is to extend its AbstractJavaSamplerClient abstract class and write the test case in its runTest method, package the test case as a jar so that the JMeter client can discover it, then configure the number of threads and start the performance test. For more details, search the web.
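As an illustration, a minimal Java Request sampler of this kind might look roughly like the following; the class name RpcEchoSampler and the callRpc placeholder are made up for this sketch, and the real call through the internal RPC framework would go where the placeholder is. The compiled class is packaged into a jar and dropped where the JMeter client can find it, typically under JMeter's lib/ext directory.

import org.apache.jmeter.config.Arguments;
import org.apache.jmeter.protocol.java.sampler.AbstractJavaSamplerClient;
import org.apache.jmeter.protocol.java.sampler.JavaSamplerContext;
import org.apache.jmeter.samplers.SampleResult;

public class RpcEchoSampler extends AbstractJavaSamplerClient {

    @Override
    public Arguments getDefaultParameters() {
        // Parameters editable in the JMeter GUI for this Java Request sampler.
        Arguments args = new Arguments();
        args.addArgument("message", "hello");
        return args;
    }

    @Override
    public SampleResult runTest(JavaSamplerContext context) {
        SampleResult result = new SampleResult();
        result.sampleStart();                              // start the timer
        try {
            String reply = callRpc(context.getParameter("message"));
            result.sampleEnd();                            // stop the timer
            result.setSuccessful(reply != null);
        } catch (Exception e) {
            result.sampleEnd();
            result.setSuccessful(false);
        }
        return result;
    }

    // Placeholder standing in for a call through the internal RPC framework;
    // in a real test this would invoke the generated client stub.
    private String callRpc(String message) {
        return "echo:" + message;
    }
}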
After JMeter is deployed on the press, multiple threads start executing the test case, and the test case starts sending call requests to the application server. The number of concurrent threads should be set so that the CPU of the press reaches a fairly high level, for example around 70%. JMeter also provides ways to scale the load out (such as distributed test execution), which can be used to increase the pressure if necessary.

Next, let me use this performance test as an opportunity to introduce several useful commands. Once the test has started, we can run the top command on the application server to observe its CPU utilization; pressing the number key 1 shows the utilization of each CPU core in detail. Ideally, a certain amount of pressure should already be visible at this point.
You can also use the vmstat command, for example vmstat 1 30, which prints statistics once per second, 30 times in total.

In the printed output, the r and b columns indicate the number of processes in the run queue and the number of processes blocked (in uninterruptible sleep), respectively. Ideally, r should be reasonably large under load while b stays small.
Details of the vmstat command can be found here: http://www.cnblogs.com/ggjucheng/archive/2012/01/05/2312625.html

After paying attention to the CPU, you can also use the iostat command to look at the I/O situation. The command format is similar to vmstat: iostat 1 30 means the same thing, print the statistics once per second, 30 times, and then exit.

Here I used iostat -dx 1 to print detailed disk statistics every second; the meaning of the columns can be found here: http://www.cnblogs.com/peida/archive/2012/12/28/2837345.html

In this test the I/O was infrequent and stayed at a very low level, so I/O should not be the bottleneck of this performance test.

After checking vmstat and iostat, if neither of them shows a very high level, the next thing to prioritize is the network. Linux also provides netstat, which is used differently from the previous two but is just as powerful: it lists the details of all TCP connections on the application server. Normally you can check the number of TCP connections and their states to judge whether the application is running normally.
It is a very useful tool; you only appreciate how much it tells you once you have to do without it.
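For example, a common way to count the connections on a service port by TCP state looks like this (8080 is just an example port; the output is one count per state such as ESTABLISHED or TIME_WAIT):

netstat -ant | grep :8080 | awk '{print $6}' | sort | uniq -c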

Purpose

In general, the purposes of the test are:
1. To obtain the interface's performance indicators, most commonly TPS, latency, and QPS.
2. To expose performance bottlenecks through the test, then locate and optimize them.

Therefore, one principle that a performance test must satisfy is to push either the press or some indicator of the application server to its limit. I say "some indicator" because it depends on the application: for a CPU-intensive application the CPU utilization should be driven up to a high level during the test, and for an I/O-intensive application the I/O utilization should be.

If neither of these can be pushed up to such a level, it is time to start locating the performance bottleneck.

Bottleneck locating

To locate a bottleneck, you must first know which metric of the application server has fallen short of expectations: is it the CPU utilization or the I/O utilization that is not high? This can be observed with the three *stat tools described above. If you cannot see any value that is relatively high, you can at least draw the conclusion that "none of them is high", which probably means your application is not as efficient as you think: somewhere there must be fierce contention or blocking. A program is never inefficient for no reason!

Of course, besides the utilization rates we can also look at how idle the CPU or the I/O devices are. Sometimes we also need concrete numbers, for example: exactly which I/O device is busy?

For example, in this performance test I used the ifstat tool, which is not built into the system and needs to be installed manually; the installation process is very simple and easy to find online.
After the installation is complete, add ifstat to the PATH variable so that the command can be run from anywhere.
In its output, eth0 is the operating system's first network interface. Here we can see that this NIC was receiving at about 15 MB/s and sending at about 8 MB/s, while the NIC's link speed is 1000 Mbit/s. Allowing for the 8b/10b encoding overhead, that works out to a theoretical throughput of roughly 100 MB/s, so the utilization was only about 15%. At this point we can say that network I/O utilization is not high; the network is not busy.

Similarly, there are times when CPU and network/disk I/O are all at low levels and yet TPS is still low. In that case, start with the program itself and check whether execution is slow for one of the following reasons:
1. Some code executes serially, such as logging. (This time I ran into exactly this: I was using a self-written tool to monitor method execution times, and during the performance test it produced a large amount of log printing and computation, which dragged the application's TPS down to a very low level. After the tool was removed, TPS returned to normal. Of course, this is not to say the tool is no good (laughs); those interested can read its introduction, it is still very useful in general. Tool address: https://github.com/liuinsect/Profiler )
2. Multi-thread contention, such as lock contention, object-monitor contention, or contention for database and distributed-cache connections (when a connection pool is used) in a multi-threaded environment; a minimal sketch of this kind of hidden serialization follows after this list.
3. Methods that are too long and too complex. Although JIT compilation can optimize to some extent, there is no lower bound on how badly code can be written, so it can always be slow enough to matter.
4. The systems being depended on are slow, such as the database or the distributed cache (of course, there are many possible reasons why they are slow; sometimes you even have to find out the cause of their slowness and then mock them away).
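As a trivial illustration of point 2, here is a minimal sketch (the class and method names are hypothetical, not from the tested code) of how a single shared lock in the request path can cap TPS no matter how many threads the press starts:

// All request threads funnel through this one synchronized method, so the
// service's throughput is limited by this lock regardless of the thread count.
public class RequestAudit {
    private static final java.util.List<String> RECORDS = new java.util.ArrayList<>();

    public static synchronized void record(String requestId) {
        RECORDS.add(requestId);     // the work is cheap, but it is fully serialized
    }
}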

At this point, we need to turn to tools such as VisualVM.

Very often we want to connect from a local machine to the JVM on a remote server. This requires setting up remote JVM monitoring, which can be done as follows:

1. First start the jstatd daemon on the remote machine. It lives in the bin directory of the JDK installation path. To configure Java security access, create a file named jstatd.all.policy in the directory where jstatd resides; on my machine that is /usr/java/jdk1.7.0_05/bin. The file contains:

grant codebase "file:${java.home}/../lib/tools.jar" {
    permission java.security.AllPermission;
};

Note that there is a semicolon at the end.

2. Run the following command to start jstatd:

jstatd -J-Djava.security.policy=jstatd.all.policy

A normal startup produces no output. The default port is 1099; the port can also be set with the -p parameter.

3. Open VisualVM locally and add a remote host with the IP address of the remote server to connect.

Of course, with only jstatd started, thread monitoring and CPU monitoring are not available in VisualVM. To use those functions you also need to connect to the JVM remotely over JMX (the linked article shows it for Tomcat); the method is described here: http://www.oschina.net/question/162973_105064
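For reference, the JVM is usually started with options along these lines to allow such a JMX connection (port 9999 is an arbitrary example; authentication and SSL are turned off here only because this is a trusted test network):

-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=9999
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Djava.rmi.server.hostname=<server ip>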

Through VisualVM we can watch GC activity and the number of threads in each state (blocked, running, and so on). If contention in the application is fierce, you should see stretches of red on the thread timeline bars, which means the threads are frequently blocked. To dig deeper, you can dump the threads through VisualVM (or run jstack -m <pid> on the application server) to see which objects the threads are blocked on, what the thread call stacks look like, which piece of application code is contended so frequently, why it is contended, and whether it can be optimized.

Besides VisualVM, there is another tool worth mentioning on Linux: perf, the performance analysis tool that ships with Linux. It gives a powerful view of system performance, for example:
1. Listing the hit rates of the L1, L2, and L3 caches (this requires hardware support, which the machine used in this test did not have).
2. Showing the performance statistics of the system or of a process: perf top -p <pid>
3. Analyzing the overall performance of a program: perf stat
And so on. For details, see: http://iamzhongyong.iteye.com/blog/1908118
There is also a small story about perf. While testing one method of the interface, we observed through perf that JVM reflection-related methods were consuming a lot of time, which reminded me that the reflection code was not caching its lookup results. After the results were decisively added to a cache, the offending method disappeared from the perf list.
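As an illustration of that kind of fix, a minimal sketch of caching reflective Method lookups might look like the following (the class and method names are hypothetical, not taken from the original code):

import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;

public class MethodCache {
    // "ClassName#methodName[paramTypes]" -> resolved Method, so the expensive
    // reflective lookup is paid once instead of on every request.
    private static final ConcurrentHashMap<String, Method> CACHE = new ConcurrentHashMap<>();

    public static Method lookup(Class<?> clazz, String name, Class<?>... paramTypes)
            throws NoSuchMethodException {
        String key = clazz.getName() + "#" + name + Arrays.toString(paramTypes);
        Method m = CACHE.get(key);
        if (m == null) {
            m = clazz.getMethod(name, paramTypes);
            CACHE.put(key, m);      // a rare duplicate lookup under a race is harmless
        }
        return m;
    }
}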

Summary:

With the performance testing ideas and tools described above, we can basically complete a round of testing and locate some of the problems. But performance problems are often well hidden, and they are affected at every stage by all kinds of conditions: configuration parameters, network conditions, machine conditions, and the performance testing tools themselves. Performance test results therefore cannot be compared independently of a specific environment; different configurations, environments, and applications can produce different results. When a problem appears, we need to analyze the execution of each step carefully, from top to bottom, and use tools to gradually narrow down the bottleneck. In short, performance testing is a test of patience, carefulness, and the breadth and depth of one's knowledge. Each time you hit a problem, ask a few more questions, do a few more rounds of analysis and verification, and try to solve and optimize it; this will certainly give you a better understanding of the system.
