Traps of Parallel Streams in Java 8 [translation]

Original article: Java Parallel Streams Are Bad for Your Health!

Java 8 brings three features we had long wanted: lambdas, the Stream API, and default interface methods. However, they are easy to abuse, to the point of damaging our own code.

Today, let's take a look at the Stream API, and at parallel streams in particular. This article outlines the traps.

But first, let's look at the reason the Stream API is so praised: parallel execution. It uses the default ForkJoinPool to speed up tasks that can be split across several threads.

Parallel Streams trap

Here is a classic example of the awesomeness that parallel streams promise you. In this example, we want to query several search engines at the same time and take the first result that comes back.

    public static String query(String question) {
        List<String> engines = new ArrayList<String>() {{
            add("http://www.google.com/?q=");
            add("http://duckduckgo.com/?q=");
            add("http://www.bing.com/search?q=");
        }};
        // get element as soon as it is available
        Optional<String> result = engines.stream().parallel().map((base) -> {
            String url = base + question;
            // open connection and fetch the result
            return WS.url(url).get();
        }).findAny();
        return result.get();
    }

Looks great, doesn't it? But let's dig into what happens behind the scenes. A parallel stream is executed by the calling thread together with the JVM's default fork/join pool, ForkJoinPool.commonPool().
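To see which threads actually do the work, here is a minimal sketch. The class name WhoRunsMyStream and the method threadsUsed are made up for illustration and are not part of the original article. It records the names of the threads that execute a parallel stream's lambda. On a typical JVM the set contains the calling thread plus workers of the common pool, but the exact names and count can differ from run to run.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

// Hypothetical helper class, for illustration only.
final class WhoRunsMyStream {

    // Returns the names of all threads that executed the lambda.
    static Set<String> threadsUsed(int tasks) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        IntStream.range(0, tasks)
                 .parallel()
                 .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("common pool parallelism: "
                + ForkJoinPool.getCommonPoolParallelism());
        // The set varies between runs, so no fixed output is shown here.
        threadsUsed(1_000).forEach(System.out::println);
    }
}
```

Every parallel stream in the same JVM shares these workers by default, which is what the rest of the article is about.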

However, it is important to note that querying a search engine is a blocking operation. So every thread involved calls get() and then waits for the result to come back.

But wait, isn't that what we wanted in the first place? We wait for all the results at the same time, instead of traversing the list and waiting for each answer in turn.

However, because ForkJoinPool workers are involved, this parallel waiting has a side effect that waiting on the main thread alone does not have. The current ForkJoinPool implementation does not compensate for blocked workers by spawning new ones. So at some point all the threads in ForkJoinPool.commonPool() are used up.

That is to say, the next time you call this query method while any other parallel stream is being processed, the performance of the second task will suffer badly.

However, do not rush to blame the ForkJoinPool implementation. In a different use case you could hand it a ManagedBlocker instance, so that it knows when to compensate for workers stuck in a blocking call.

Interestingly, a parallel stream does not even need blocking calls to drag down the performance of your program. Any long-running function that is mapped over a collection produces the same problem.
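The ManagedBlocker idea can be sketched as follows. This is an illustration under assumptions, not code from the original article: BlockingCalls, callBlocking and slowLookup are hypothetical names, and slowLookup merely sleeps to stand in for a network request. ForkJoinPool.managedBlock tells the pool that the current worker is about to block, so the pool may activate a spare worker. Whether it actually does so depends on the JDK version and the pool's limits.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.function.Supplier;

// Hypothetical helper class, for illustration only.
final class BlockingCalls {

    // Runs a blocking call through ForkJoinPool.managedBlock so that the
    // pool is informed and may compensate for the blocked worker.
    static <T> T callBlocking(Supplier<T> blockingCall) {
        class Blocker implements ForkJoinPool.ManagedBlocker {
            private T result;
            private boolean done;

            @Override
            public boolean block() {
                result = blockingCall.get(); // the potentially slow call
                done = true;
                return true;                 // no further blocking is needed
            }

            @Override
            public boolean isReleasable() {
                return done;                 // true once the result is available
            }
        }
        Blocker blocker = new Blocker();
        try {
            ForkJoinPool.managedBlock(blocker);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while blocking", e);
        }
        return blocker.result;
    }

    // Hypothetical stand-in for a network request.
    static String slowLookup(String q) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "result for " + q;
    }

    public static void main(String[] args) {
        System.out.println(callBlocking(() -> slowLookup("java")));
    }
}
```

In the query example, the mapping step could then wrap its request as callBlocking(() -> WS.url(url).get()), assuming the same WS client as above.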

Let's look at the example below.

    long a = IntStream.range(0, 100).parallel().mapToLong(x -> {
        // long-running work inside the mapping function
        for (int i = 0; i < 100_000_000; i++) {
            System.out.println("X:" + i);
        }
        return x;
    }).sum();

This code suffers from the same problem as the network access code above. No lambda invocation finishes in an instant, and while they run, the rest of the program cannot use those workers.

This means that any program relying on parallel streams becomes unpredictable, with trouble lurking whenever something else occupies the common ForkJoinPool.

What then? I am not the master of my program

Indeed, if you are writing an otherwise single-threaded program and know exactly when you intend to use parallel streams, you may think this problem is rather superficial. However, many of us deal with web applications, various frameworks, and heavyweight application servers.

How can a server that is designed to host multiple independent applications, doing who knows what, offer you predictable parallel stream performance if it does not control the inputs?

One way is to limit the parallelism that the ForkJoinPool offers. You can pass -Djava.util.concurrent.ForkJoinPool.common.parallelism=1 to limit the pool size to 1. With no gain left from parallelization, nothing can tempt you into using it incorrectly. (The logic is so airtight that I am speechless.)

Another way would be a parallelStream() implementation that accepts the ForkJoinPool to run in. Unfortunately, the current JDK does not offer that.
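To check what the setting does on a given JVM, a small sketch like the one below can help. The class name CommonPoolParallelism is made up for illustration. It prints the configured system property next to the value the common pool actually reports via ForkJoinPool.getCommonPoolParallelism(). Note that the property has to be set on the command line, or at least before the common pool is first used.

```java
import java.util.concurrent.ForkJoinPool;

// Hypothetical helper class, for illustration only.
final class CommonPoolParallelism {

    static final String PROPERTY =
            "java.util.concurrent.ForkJoinPool.common.parallelism";

    // Reports the target parallelism of the common pool.
    static int current() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        // Example launch:
        //   java -Djava.util.concurrent.ForkJoinPool.common.parallelism=1 CommonPoolParallelism
        System.out.println("configured: " + System.getProperty(PROPERTY));
        System.out.println("effective:  " + current());
    }
}
```

Without the flag, the configured value prints as null and the effective value is derived from the number of available processors.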
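A commonly used workaround, which is not mentioned in the original article, is to start the stream from inside a task submitted to a dedicated ForkJoinPool. The sketch below assumes that the stream's subtasks are then forked into that pool rather than the common one. This relies on how the JDK's fork/join machinery behaves in practice and is not a documented guarantee of the Stream API, so treat it with care. DedicatedPoolStream and workerNames are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper class, for illustration only.
final class DedicatedPoolStream {

    // Runs a parallel stream from inside a task of the given pool and
    // returns the distinct names of the threads that executed the lambda.
    static List<String> workerNames(ForkJoinPool pool, int tasks) {
        return pool.submit(() ->
                IntStream.range(0, tasks)
                         .parallel()
                         .mapToObj(i -> Thread.currentThread().getName())
                         .distinct()
                         .collect(Collectors.toList())
        ).join();
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(4); // pool size chosen arbitrarily
        try {
            // The names vary between runs, so no fixed output is shown here.
            workerNames(pool, 1_000).forEach(System.out::println);
        } finally {
            pool.shutdown();
        }
    }
}
```

Blocking or long-running work isolated this way can still exhaust its own pool, but it no longer starves unrelated parallel streams that use the common pool.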

The moral of the story

Parallel streams are unpredictable and complicated to use correctly. Almost any use of parallel streams can affect the performance of unrelated parts of the program in ways that are hard to foresee. I have no doubt that some people can use them correctly and effectively. However, before typing stream.parallel() into my code, I will think twice, and I will review all the code that contains it with extra care.
