The Requests/second spike problem in performance stress testing


Recently I have been busy with the final sprint of a project to switch over to Java. The early-stage coding, debugging, and bug fixing are gradually wrapping up, and what remains is the performance stress test before going live.

This is not a stress test required ahead of a major promotion, but to be on the safe side we still need to have rough numbers in mind.

After all, what is being moved to Java is a core public service of the group (mainly the order domain service). (Once we are live, I will write up the ups and downs along the way.)

Enough preamble; on to the topic.

This stress test focuses on two core order services: the ordering service and the ticket query service. The initial stress test raised no problems, mainly thanks to the database indexes and caching.

The ordering service has two core interfaces: pre-order query and order creation. The pre-order query computes the pre-order state shown on the settlement page (and not only there) without actually placing an order: promotions, coupon codes, virtual-coin rules, and so on.

The order creation logic is somewhat complicated and depends on many peripheral systems and middleware, so it needs attention. At the very least you should know where you stand: even if the bottleneck turns out to be a downstream service, it can be optimized before the next major promotion.

(Not every performance problem has to be fixed right away, as long as the expected business volume can be supported. Performance optimization is endless, so you need to pace yourself.)

Before submitting the service to the horizontal stress test process, we first run a few rounds ourselves to make the later stress test more efficient. Because of the tight schedule and some environmental constraints, I removed several dependencies in the service that have no stress testing environment. (I will summarize some stress testing practices in the next article, so I will not go into them here.) After several rounds of stress testing (about 30 minutes) to clear out environment, code, and dependency obstacles, I submitted the service to the horizontal stress test process and went on to other work. (There were plenty of strange problems ~_~. The MyBatis PageHelper plugin seems to have a concurrency problem, but I have not pinned it down yet and do not know whether that is really the cause. I will keep investigating and share the conclusion once I have one.)

1. Stress test report:
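As a rough illustration of what a Requests/second spike looks like in raw data, here is a minimal sketch that buckets request completion times into one-second windows and flags the windows that deviate strongly from the median. The class name, the method names, and the tolerance value are assumptions made for this example; a real stress testing tool computes its report in its own way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Buckets request completion times into one-second windows and flags unusual windows. */
final class ThroughputSpikes {

    /** Counts completed requests per second; timestamps are in milliseconds, in any order. */
    static int[] perSecond(long[] completionMillis) {
        if (completionMillis.length == 0) {
            return new int[0];
        }
        long first = Arrays.stream(completionMillis).min().getAsLong();
        long last = Arrays.stream(completionMillis).max().getAsLong();
        int[] counts = new int[(int) ((last - first) / 1000) + 1];
        for (long t : completionMillis) {
            counts[(int) ((t - first) / 1000)]++;
        }
        return counts;
    }

    /** Returns the indexes of seconds whose count differs from the median by more than tolerance (0.5 = 50%). */
    static List<Integer> outliers(int[] counts, double tolerance) {
        List<Integer> result = new ArrayList<>();
        if (counts.length == 0) {
            return result;
        }
        int[] sorted = counts.clone();
        Arrays.sort(sorted);
        double median = sorted[sorted.length / 2];
        for (int i = 0; i < counts.length; i++) {
            if (Math.abs(counts[i] - median) > tolerance * median) {
                result.add(i);
            }
        }
        return result;
    }
}
```

Comparing against the median rather than the mean keeps a few bad seconds from dragging the baseline toward themselves.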

2. View server monitoring information:

Java GC:

3. View the database information:

4. Analysis

There is a problem in the figures above; I wonder if you can spot it. The network traffic of the app server hosting the ordering service looks wrong: receive and send do not match up.

5. Troubleshooting

At this point a conclusion can already be drawn: there is no bottleneck on any server, whether application server, DB, or cache. The problem should be in the program. (Troubleshoot from top to bottom and from bottom to top.)

I started with the dependent services. The ordering service mainly depends on the product and promotion services. The cache should not be the problem, because there is a local first-level cache and the cache expiration time does not fit the symptom either. Redis and MySQL in the stress testing environment run on one machine (a very powerful one), so there is no problem with the DB and there should be essentially none with Redis. I had commented out the interfaces that depend on other business teams, so no other dependencies remain.
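The article does not show its cache code. As a minimal sketch of what a local first-level cache with an expiration time can look like, assuming a fixed time-to-live and a loader function that stands in for the Redis or DB lookup (all names here are hypothetical):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** A tiny local (first-level) cache with a fixed time-to-live in front of a slower loader. */
final class LocalTtlCache<K, V> {

    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentHashMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;     // injectable so expiry can be exercised in tests
    private final Function<K, V> loader;  // stands in for the second-level lookup (Redis, DB, ...)

    LocalTtlCache(long ttlMillis, LongSupplier clock, Function<K, V> loader) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        this.loader = loader;
    }

    /** Returns the cached value, reloading it once the entry has expired. */
    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.compute(key, (k, old) ->
                (old != null && old.expiresAtMillis > now)
                        ? old
                        : new Entry<>(loader.apply(k), now + ttlMillis));
        return entry.value;
    }
}
```

With a cache like this, every key loaded at the same moment also expires at the same moment, which is one reason to compare the expiration time with the rhythm of the spikes.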

I began to suspect the product and promotion services, even though I had already tested the two on their own: both mostly hit the cache, and QPS was close to 18000. So I ran a detailed stress test on these two services again.

Unfortunately that gave no clue; their performance was good.

Next I checked the thread pools for blocked threads. I printed the threads with jstack: they were basically XNIO threads in condition wait, nothing abnormal. Since the other interfaces of the order service behaved normally, the thread pool was unlikely to be the problem. After an order is placed successfully there is one deliberate hold: the logic that holds virtual currency, cards, and coupon codes runs in a fixed thread pool (five threads, with a saturation policy and log output configured). That was not the problem either.
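For reference, a fixed pool of five threads with a saturation policy that logs could be set up roughly as follows. This is a sketch under assumptions: the article does not say which policy or queue it uses, so the bounded queue, the caller-runs fallback, and all names below are illustrative only.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Logger;

/** Fixed-size pool for the post-order hold logic, with a logging saturation policy. */
final class HoldPool {

    private static final Logger LOG = Logger.getLogger(HoldPool.class.getName());

    /** Counts rejected tasks so that saturation shows up in metrics and tests. */
    static final AtomicInteger REJECTED = new AtomicInteger();

    static ThreadPoolExecutor create(int queueCapacity) {
        RejectedExecutionHandler logAndRunInCaller = (task, pool) -> {
            REJECTED.incrementAndGet();
            LOG.warning("hold pool saturated: active=" + pool.getActiveCount()
                    + " queued=" + pool.getQueue().size());
            if (!pool.isShutdown()) {
                task.run(); // caller-runs: slows the submitting thread instead of dropping work
            }
        };
        // corePoolSize == maximumPoolSize == 5 makes this a fixed pool of five threads.
        return new ThreadPoolExecutor(5, 5, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity), logAndRunInCaller);
    }
}
```

If such a pool were the bottleneck, the saturation log lines and threads blocked in the pool would show up in the logs and in the jstack output, which is the evidence being checked here.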

Then I went through the logs: restful-slow.log, jdbc-slow.log, the error logs, and so on, with cat ... | grep ... | wc -l. Nothing abnormal. (I was starting to sweat ...)

That left only the heavy-handed approach: comment out code, re-run the stress test, and try the pieces one by one. I first commented out the DB access, then the thread pool hold logic, then the message sending. (A crude move ...)
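The same elimination can be done without editing the code each round. The following is a minimal sketch, assuming the order flow can be expressed as named steps; the class, the step names, and the order.skip property are hypothetical and not taken from the actual service.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Runs named steps in order, skipping disabled ones, and records the elapsed time of each. */
final class StepSwitches {

    private final Map<String, Runnable> steps = new LinkedHashMap<>();

    StepSwitches step(String name, Runnable body) {
        steps.put(name, body);
        return this;
    }

    /** Runs every step whose name is not in 'disabled'; returns elapsed nanoseconds per executed step. */
    Map<String, Long> run(Set<String> disabled) {
        Map<String, Long> elapsed = new LinkedHashMap<>();
        for (Map.Entry<String, Runnable> e : steps.entrySet()) {
            if (disabled.contains(e.getKey())) {
                continue;
            }
            long start = System.nanoTime();
            e.getValue().run();
            elapsed.put(e.getKey(), System.nanoTime() - start);
        }
        return elapsed;
    }

    /** Parses a comma-separated system property, e.g. -Dorder.skip=mq,db (hypothetical name). */
    static Set<String> disabledFromProperty(String property) {
        Set<String> names = new HashSet<>();
        for (String raw : System.getProperty(property, "").split(",")) {
            String name = raw.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }
}
```

Each stress test round can then be started with a different skip list, and the per-step timings give a second hint about which step stalls.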

6. The cause surfaces

When I commented out the message-sending logic, the problem disappeared. There was hope. I started reading the code, and there is hardly any logic: it simply calls Spring's RabbitTemplate.convertAndSend method. (This method is synchronous; nothing declares it as async.)

/** Send a message */
template.convertAndSend(messageConfig.getExchangeName(), routingKey, message, amqpMessage -> {

I went through the documentation and found no special requirements for using it.
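To see why a synchronous publish matters, consider that the request thread cannot return until the call does. The sketch below uses only the JDK. MessageSender is a hypothetical stand-in for the real publish call, not a Spring AMQP type, and the slow sender in the usage note simulates a broker that is slow to respond.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

/** Shows the cost of a blocking publish and one way to move it off the request thread. */
final class SendOffload {

    /** Hypothetical stand-in for the real publish call. */
    interface MessageSender {
        void send(String routingKey, String payload);
    }

    /** Measures how long a call blocks the current thread, in milliseconds. */
    static long timeMillis(Runnable call) {
        long start = System.nanoTime();
        call.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    /** Wraps a sender so that send() returns at once and the real publish runs on 'executor'. */
    static MessageSender offload(MessageSender delegate, ExecutorService executor) {
        return (routingKey, payload) ->
                executor.execute(() -> delegate.send(routingKey, payload));
    }
}
```

With a sender that sleeps 200 ms, timeMillis reports roughly 200 ms for the direct call and close to zero for the wrapped one. Offloading only hides a slow broker: messages still queue up behind it and can be lost if the process dies, so it is a way to confirm the diagnosis rather than a fix in itself.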

I then checked the configuration file and found that messages were being sent to the RabbitMQ in the qa environment. I knew that RabbitMQ in the stress testing environment had not been set up at the time. In addition, our process requires a queue to be defined before it is used, so using one means configuring it first. At the time I took the convenient route and assumed stress testing against it would be fine. After all, MQ is designed for high throughput and its TPS should be plenty for us; besides, I had stress-tested the qa MQ before.

(For practical reasons the resources are not isolated, and the stress testing environment is sometimes set up on a temporary basis. Codis is another middleware shared with the qa environment, but Codis is basically a second-level cache, so that is not serious. It can be tested again later.)

I created an account on the RabbitMQ server in the qa environment, opened the dashboard in the RabbitMQ management interface, and kept an eye on this server. (Run the top command and press P / M to sort by CPU / memory and view the RabbitMQ process metrics.)

7. A slap in the face

While I was in a meeting, the colleague running the stress tests came to find me: he had run into a problem again.

8. Summary

Isolate the environments as much as possible. Environmental problems are the biggest headache when troubleshooting, though sometimes they cannot be avoided. (The next stress testing article will share troubleshooting methods and tools for environmental problems.)

When you hit a problem, you must find the root cause. Even if you cannot, you must at least narrow it down to a certain range, for example to the DB or the operating system.

 

Author: Wang qingpei

Source: http://www.cnblogs.com/wangiqngpei557/

The copyright of this article is shared by the author and the blog Park. You are welcome to repost this article. However, unless the author agrees otherwise, you must retain this statement and provide a clear link to the original article on the article page. Otherwise, the right to pursue legal liability is reserved.
