(6) Spring WebFlux Performance Test -- Reactive Spring's Dao Fa Shu Qi


This article is part of the series "Reactive Spring's Dao Fa Shu Qi".
Previously: Reactive Streams | Reactor 3 Quick Start | Spring WebFlux Quick Start
Source code for this article

1.4 The benefits of asynchronous non-blocking, seen through load testing

So far we have only been singing the praises of asynchronous non-blocking; now let's actually measure how reactive programming performs under high concurrency. The benefits of asynchronous non-blocking show up around I/O: file I/O, network I/O, and database reads/writes can all block.

Our test covers three things:

    1. First, build Web services based on Spring MVC and WebFlux and compare the performance gains of asynchronous non-blocking handling. We simulate a simple scenario with a fixed delay, then drive the services with Gatling and analyze the results;
    2. Since microservice architectures are increasingly common, we extend the first test project to observe the numbers when calling a delayed downstream service. This is mainly a test of the clients: the blocking RestTemplate versus the non-blocking WebClient;
    3. Performance testing and analysis of MongoDB's synchronous and asynchronous database drivers.

Note: this section does not attempt rigorous, scenario-specific load testing of the kind done for performance tuning. The scenarios here are deliberately simple and crude; the point is the comparison.
Also: since this is a horizontal comparison, the exact hardware configuration matters little, but testing on Linux is recommended. I first ran on Windows 10, where many requests failed once the user count climbed; the data below comes from a laptop running Deepin Linux (Debian-based).

So let's start building this simple, rough test environment.

1.4.1 Load test analysis with delay

1) Build the projects to be tested

We create two projects, based on Spring MVC and WebFlux respectively: mvc-with-latency and WebFlux-with-latency.

To simulate blocking, each project exposes an API with a latency parameter: /hello/{latency}. For example, /hello/100 delays its response by 100ms.

In mvc-with-latency, create HelloController.java:

    import java.util.concurrent.TimeUnit;

    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    public class HelloController {

        @GetMapping("/hello/{latency}")
        public String hello(@PathVariable long latency) {
            try {
                TimeUnit.MILLISECONDS.sleep(latency);   // 1
            } catch (InterruptedException e) {
                return "Error during thread sleep";
            }
            return "Welcome to reactive world ~";
        }
    }
    1. Use sleep to simulate blocking in the business logic.

In WebFlux-with-latency, create HelloController.java:

    import java.time.Duration;

    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RestController;

    import reactor.core.publisher.Mono;

    @RestController
    public class HelloController {

        @GetMapping("/hello/{latency}")
        public Mono<String> hello(@PathVariable int latency) {
            return Mono.just("Welcome to reactive world ~")
                    .delayElement(Duration.ofMillis(latency)); // 1
        }
    }
    1. Use the delayElement operator to implement the delay.

Next, configure the ports, 8091 and 8092 respectively, in each project's application.properties. In mvc-with-latency:

server.port=8091
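
...and in WebFlux-with-latency:

    server.port=8092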

Launch both applications.
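
Before load testing, it's worth a quick smoke test of both endpoints (my own sanity check, using the ports configured above):

    curl http://localhost:8091/hello/100
    curl http://localhost:8092/hello/100

Each should print "Welcome to reactive world ~" after roughly a 100ms delay.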

2) Write the load test script

In this section we use Gatling for load testing. Create a test project gatling-scripts.

Add the Gatling dependency and plugin to the POM (at the time of writing there is no Gatling plugin for Gradle, so this has to be a Maven project):

    <dependencies>
        <dependency>
            <groupId>io.gatling.highcharts</groupId>
            <artifactId>gatling-charts-highcharts</artifactId>
            <version>2.3.0</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>io.gatling</groupId>
                <artifactId>gatling-maven-plugin</artifactId>
                <version>2.2.4</version>
            </plugin>
        </plugins>
    </build>

Create a test class under src/test; Gatling test classes are written in Scala:

    import io.gatling.core.scenario.Simulation
    import io.gatling.core.Predef._
    import io.gatling.http.Predef._
    import scala.concurrent.duration._

    class LoadSimulation extends Simulation {

      // Read baseUrl, path, and the number of simulated users from system properties
      val baseUrl = System.getProperty("base.url")
      val testPath = System.getProperty("test.path")
      val sim_users = System.getProperty("sim.users").toInt

      val httpConf = http.baseURL(baseUrl)

      // Define the simulated request, repeated 30 times
      val helloRequest = repeat(30) {
        // Custom test name
        exec(http("hello-with-latency")
          // Execute the GET request
          .get(testPath))
          // Simulate user "think time": a random pause of 1~2 seconds
          .pause(1 second, 2 seconds)
      }

      // Define the simulated scenario
      val scn = scenario("hello")
        // The scenario executes the request defined above
        .exec(helloRequest)

      // Ramp the number of concurrent users evenly up to sim_users over 30 seconds
      setUp(scn.inject(rampUsers(sim_users).over(30 seconds)).protocols(httpConf))
    }

As above, the scenario for this test is:

    • The specified number of users is ramped up at a constant rate over 30 seconds;
    • Each user requests the specified URL 30 times, pausing a random 1-2 seconds of "think time" between requests.

The URL and user count are passed in as the system properties base.url, test.path, and sim.users. Using the Maven plugin, start a test with a command like:

    mvn gatling:test -Dgatling.simulationClass=test.load.sims.LoadSimulation -Dbase.url=http://localhost:8091/ -Dtest.path=hello/100 -Dsim.users=300

This runs a test against http://localhost:8091/hello/100 with 300 users.
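
With 300 users each repeating the request 30 times, such a run issues 300 × 30 = 9,000 requests in total.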

3) Observe the number of threads

Before testing, we open jconsole to watch the application's threads (connect to the mvc-with-latency application):

Just after startup, with no requests coming in, there are 10 worker threads by default and 31-33 threads in total.

When a test with 2,500 users runs, for example, the worker thread count climbs to 200 and the total thread count peaks at 223 — the increase is those extra ~190 worker threads:

Since the worker threads are gradually released back down to 10 once the load passes, reading the instantaneous thread count is unreliable. Instead we use "peak thread count − 23" to estimate the number of worker threads during a test (23 being the non-worker threads observed at startup: 33 total minus 10 workers).
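
jconsole works fine for this, but if you would rather capture the numbers in logs, the same values can be read with the JDK's standard ThreadMXBean. A minimal sketch (the ThreadCountLogger class and the one-second interval are my own illustration), to be started from inside the application under test:

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;

    public class ThreadCountLogger {

        // Call ThreadCountLogger.start() once at application startup.
        public static void start() {
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            Thread logger = new Thread(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    // Same numbers the jconsole "Threads" tab shows
                    System.out.printf("live threads=%d, peak=%d%n",
                            mx.getThreadCount(), mx.getPeakThreadCount());
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }, "thread-count-logger");
            logger.setDaemon(true);
            logger.start();
        }
    }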

4) Load test

First we test mvc-with-latency:

    • -Dbase.url=http://localhost:8091/;
    • -Dtest.path=hello/100 (100ms delay);
    • -Dsim.users=1000/2000/3000/.../10000.

The test data is as follows (Tomcat max threads 200, 100ms delay):

The above data shows that:

    1. The thread count reached the default maximum of 200 when the user count approached 3,000;
    2. Until the thread count reached 200, the 95th-percentile response time stayed normal (a little over 100ms); after that it rose in a straight line;
    3. Once the thread count hit 200, throughput growth gradually slowed.

The reason is that once all available threads are blocked, newly arriving requests can only wait in a queue, so response time starts to climb as soon as the maximum thread count is reached. Take the 6,000-user report as an example:

This graph shows how response time evolves over the course of the test; it can be roughly divided into five segments:

    • A. Idle threads are still available, and requests return in a little over 100ms;
    • B. The threads are all busy and new requests start to queue; since the user count is still ramping up during phases A and B, the queue keeps growing;
    • C. The request rate stabilizes, but response times stay high for a while because of the queue;
    • D. Some users finish their requests, the requests per second gradually fall, and the queue gradually drains;
    • E. Once the user count drops below thread capacity and the queue is drained, requests return in normal time again.

The response time for all requests is distributed as shown:

The gap between the response times of segments A/E and segment C is the average queueing time. Under sustained high concurrency, most requests sit in segment C, and the waiting time grows linearly with increasing request volume.
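
A back-of-the-envelope check of that ceiling, using the numbers above: each of the 200 Tomcat worker threads is blocked for about 100ms per request, so one thread can complete at most 10 requests per second, and the server saturates around 200 × 10 = 2,000 req/sec. On the load side, each user offers roughly one request every 1.6 seconds (1-2 seconds of think time plus the ~100ms response), so about 3,000 users offer close to 1,900 req/sec — consistent with the threads maxing out near 3,000 users in the data above. Any sustained load beyond the ceiling can only queue, and the wait in segment C grows with it.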

Increasing the number of threads the servlet container uses to handle requests can alleviate this problem.

The cap of 200 threads is Tomcat's default. We raise it to 400 and test again by adding to application.properties:

server.tomcat.max-threads=400

The test data is as follows:

With the worker thread count doubled, request queueing is roughly halved. Comparing the two data sets:

    1. The "95% response time" for "max threads 200, 5,000 users" is exactly the same as for "max threads 400, 10,000 users" (I swear this is real data!), and even more coincidentally the throughput is exactly a 1:2 ratio. A coincidence this neat is only possible because the test scenario is so simple and crude, haha;
    2. The slopes of the "95% response time" curves also differ by exactly a factor of two.

This confirms the analysis above. Adding threads does improve throughput to a degree and reduces the response latency caused by blocking, but it forces trade-offs:

    • Threads are not free: by default the JVM allocates a 1MB stack for every new thread, so more threads means more memory;
    • More threads also means more thread context-switching overhead.
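
To put rough numbers on the first point: at the default ~1MB stack per thread, 400 worker threads reserve on the order of 400MB for stacks alone. The per-thread stack size can be lowered with the JVM's -Xss flag, though setting it too low risks StackOverflowError.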

Now let's look at the test data for WebFlux-with-latency:

    • Thread count isn't recorded, because for a WebFlux application running on Netty's asynchronous I/O the number of worker threads stays fixed. That fixed number usually equals the number of CPU cores (in jconsole you can see the reactor-http-nio-X and parallel-X threads; my machine is a quad-core, eight-thread i7, so X runs from 1 to 8). Because processing is event-driven and non-blocking, no additional threads are needed for concurrency;
    • As the user count grows, throughput rises roughly linearly;
    • 95% of responses return within a little over 100ms, with no added delay.

As you can see, non-blocking processing avoids queueing for threads: a small, fixed number of threads can handle a large volume of requests.
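
To make the fixed-thread behavior concrete, here is a small standalone Reactor sketch of my own (assuming only reactor-core on the classpath). delayElement hands the delay to a shared timer scheduler (Schedulers.parallel() by default) rather than parking the calling thread, which is why WebFlux never has to grow a thread pool the way Tomcat does:

    import java.time.Duration;
    import java.util.concurrent.CountDownLatch;

    import reactor.core.publisher.Mono;

    public class DelayDemo {

        public static void main(String[] args) throws InterruptedException {
            CountDownLatch latch = new CountDownLatch(1);
            Mono.just("Welcome to reactive world ~")
                    .delayElement(Duration.ofMillis(100))  // runs on Schedulers.parallel()
                    .subscribe(s -> {
                        // The delayed signal arrives on a "parallel-N" thread
                        System.out.println("received on " + Thread.currentThread().getName());
                        latch.countDown();
                    });
            // subscribe() returned immediately; the main thread was never blocked
            System.out.println("subscribed on " + Thread.currentThread().getName());
            latch.await();
        }
    }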

Going one step further, I tested 20,000 users directly:

  1. The mvc-with-latency test failed because too many requests errored out;
  2. WebFlux-with-latency handled 20,000 users without batting an eye: throughput reached 7,228 req/sec (exactly double the 10,000-user figure — what a day for coincidences, and again absolutely real data!), with a 95% response time of only 117ms.

Finally, two graphs of throughput and response time give a more intuitive feel for just how far ahead asynchronous non-blocking WebFlux pulls:

Now we can better appreciate what Node.js is so proud of; but Java has Vert.x too, and now Spring WebFlux.

In this section we load-tested the server side. In the next section we analyze the performance of Spring WebFlux's client-side tool, WebClient, which can bring no small performance boost to a microservice architecture.
