Something less common in Java (4)-Fork/join

Last Update:2018-07-26 Source: Internet

Author: User



Introduction

It has been a long time since I last wrote for the "Less common in Java" series. Today's topic is Java's fork/join parallel processing model. Fork/join was introduced in JDK 1.7 and can, to some extent, implement simple map-reduce style operations. I have also collected some blog posts on questions that come up very frequently in interviews; you can find them at http://blog.csdn.net/u012403290.

Technical Points

1. Map-reduce
A programming model for processing large data sets, divided into two parts: "map" and "reduce". When applied to distributed programming, it can greatly improve the efficiency and speed of a computation. Put simply, a very large task is split into many small tasks, each small task is handled by its own thread, and finally the results are combined.

2. Background
Fork/join's approach to processing large amounts of data builds on today's multi-core hardware; it embodies the idea of making full use of available resources. Multi-core processors are now mainstream, and concurrent programming is largely about using multiple threads well, so that the machine's resources are used to the fullest.

Fork/join Structure

To use the Fork/join framework correctly, you need to be familiar with its structure. A job that is split into pieces needs two things: ① task scheduling; ② task execution. In Fork/join, tasks are submitted and scheduled by a dedicated thread pool called ForkJoinPool, and the tasks themselves are represented by the framework's own task class, ForkJoinTask.

However, we do not normally use ForkJoinTask directly to split and execute tasks. Instead we use one of its two subclasses, RecursiveAction and RecursiveTask: the former is for tasks that do not return a result, the latter for tasks that do. To sum up, the basic model of Fork/join is as follows:
[Figure: basic model of Fork/join]
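As a minimal sketch of the difference between the two subclasses, the example below uses a RecursiveAction to double an array in place (no result) and a RecursiveTask to sum it (a result). The class names DoubleAction, SumTask and ActionVsTaskDemo, and the threshold of 4, are assumptions made up for this illustration; they are not part of the JDK.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.RecursiveTask;

// Hypothetical RecursiveAction: doubles every element in place and returns nothing.
class DoubleAction extends RecursiveAction {
    private static final int THRESHOLD = 4; // illustrative split threshold
    private final long[] data;
    private final int from, to;             // half-open range [from, to)

    DoubleAction(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected void compute() {
        if (to - from <= THRESHOLD) {
            for (int i = from; i < to; i++) data[i] *= 2;
        } else {
            int mid = (from + to) >>> 1;
            // invokeAll forks both halves and waits for both to finish
            invokeAll(new DoubleAction(data, from, mid), new DoubleAction(data, mid, to));
        }
    }
}

// Hypothetical RecursiveTask: sums the same kind of range and returns the total.
class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 4;
    private final long[] data;
    private final int from, to;

    SumTask(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) sum += data[i];
            return sum;
        }
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                          // schedule the left half
        return right.compute() + left.join(); // do the right half here, then combine
    }
}

class ActionVsTaskDemo {
    static long doubleThenSum(long[] data) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            pool.invoke(new DoubleAction(data, 0, data.length));    // no result
            return pool.invoke(new SumTask(data, 0, data.length));  // Long result
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println(doubleThenSum(data)); // prints 110
    }
}
```

The only structural difference is the return type of compute(): void for the action, Long for the task.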

Let's look at each of these parts in turn:

①ForkJoinPool:
Many explanations of the ForkJoinPool source code on the Internet are out of date. As of JDK 1.8, the pool no longer maintains two separate arrays of ForkJoinTask and ForkJoinWorkerThread (the former being the tasks, the latter the threads that execute them). Instead it uses an inner class, WorkQueue. Here is its source in JDK 1.8:

    /**
     * Queues supporting work-stealing as well as external task
     * submission. See above for descriptions and algorithms.
     * Performance on most platforms is very sensitive to placement of
     * instances of both WorkQueues and their arrays -- we absolutely
     * do not want multiple WorkQueue instances or multiple queue
     * arrays sharing cache lines. The @Contended annotation alerts
     * JVMs to try to keep instances apart.
     */
    @sun.misc.Contended
    static final class WorkQueue {
        // Instance fields
        volatile int scanState;    // versioned, <0: inactive; odd: scanning
        int stackPred;             // pool stack (ctl) predecessor
        int nsteals;               // number of steals
        int hint;                  // randomization and stealer index hint
        int config;                // pool index and mode
        volatile int qlock;        // 1: locked, < 0: terminate; else 0
        volatile int base;         // index of next slot for poll
        int top;                   // index of next slot for push
        ForkJoinTask<?>[] array;   // the elements (initially unallocated)
        final ForkJoinPool pool;   // the containing pool (may be null)
        final ForkJoinWorkerThread owner; // owning thread or null if shared
        volatile Thread parker;    // == owner during call to park; else null
        volatile ForkJoinTask<?> currentJoin;  // task being joined in awaitJoin
        volatile ForkJoinTask<?> currentSteal; // mainly used by helpStealer
    }

Reading the source carefully, we find that the current structure is completely different from the original one. Previously, tasks had to be handed out from the ForkJoinTask array to the ForkJoinWorkerThreads for execution. Now the inner class WorkQueue does this job: each WorkQueue has a ForkJoinWorkerThread owner, the thread that serves the queue, and among its fields there is a ForkJoinTask array holding the tasks that this thread needs to execute.

Reading the description of this inner class, we find that the queue also supports work stealing between threads. What is work stealing? Imagine you and a friend are each eating your own share of fruit: you finish yours while your friend still has some left, so you quietly take a few pieces from your friend's share. Applied to threads: suppose two WorkQueues, A and B, are each executing tasks. When A's tasks are finished but B's are not, A's worker takes some tasks from the tail of the ForkJoinTask array in B's WorkQueue and executes them, which makes better use of the available computing power.
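Work stealing can be observed from the outside through ForkJoinPool.getStealCount(), which returns an estimate of how many tasks were taken from one thread's queue by another. The sketch below is only an illustration under assumptions: RangeSum and StealDemo are hypothetical names, the threshold of 1,000 is arbitrary, and the steal count will vary from run to run and machine to machine.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical task: sums the inclusive range [lo, hi] by recursive splitting.
class RangeSum extends RecursiveTask<Long> {
    private static final long THRESHOLD = 1_000; // illustrative value
    private final long lo, hi;

    RangeSum(long lo, long hi) {
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo < THRESHOLD) {
            long sum = 0;
            for (long i = lo; i <= hi; i++) sum += i;
            return sum;
        }
        long mid = lo + (hi - lo) / 2;
        RangeSum left = new RangeSum(lo, mid);
        RangeSum right = new RangeSum(mid + 1, hi);
        left.fork();                          // queued; an idle worker may steal it
        return right.compute() + left.join();
    }
}

class StealDemo {
    // Returns {sum, estimated steal count} for the sum 1..n.
    static long[] run(long n, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            long sum = pool.invoke(new RangeSum(1, n));
            return new long[] { sum, pool.getStealCount() };
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] r = run(1_000_000, 4);
        System.out.println("sum = " + r[0] + ", steals (estimate) = " + r[1]);
    }
}
```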

We will not dig any deeper into the source code here; that is not the purpose of this post. Next, let's look at the ways to submit a task to a ForkJoinPool:

A, submit

    /**
     * Submits a ForkJoinTask for execution.
     *
     * @param task the task to submit
     * @param <T> the type of the task's result
     * @return the task
     * @throws NullPointerException if the task is null
     * @throws RejectedExecutionException if the task cannot be
     *         scheduled for execution
     */
    public <T> ForkJoinTask<T> submit(ForkJoinTask<T> task) {
        if (task == null)
            throw new NullPointerException();
        externalPush(task);
        return task;
    }

B, execute

    /**
     * Arranges for (asynchronous) execution of the given task.
     *
     * @param task the task
     * @throws NullPointerException if the task is null
     * @throws RejectedExecutionException if the task cannot be
     *         scheduled for execution
     */
    public void execute(ForkJoinTask<?> task) {
        if (task == null)
            throw new NullPointerException();
        externalPush(task);
    }

C, invoke

    /**
     * Performs the given task, returning its result upon completion.
     * If the computation encounters an unchecked Exception or Error,
     * it is rethrown as the outcome of this invocation.  Rethrown
     * exceptions behave in the same way as regular exceptions, but,
     * when possible, contain stack traces (as displayed for example
     * using {@code ex.printStackTrace()}) of both the current thread
     * as well as the thread actually encountering the exception;
     * minimally only the latter.
     *
     * @param task the task
     * @param <T> the type of the task's result
     * @return the task's result
     * @throws NullPointerException if the task is null
     * @throws RejectedExecutionException if the task cannot be
     *         scheduled for execution
     */
    public <T> T invoke(ForkJoinTask<T> task) {
        if (task == null)
            throw new NullPointerException();
        externalPush(task);
        return task.join();
    }

These three submission methods differ. submit starts the task asynchronously and returns the task itself; execute also runs the task asynchronously but returns nothing; invoke submits the task, waits for it to complete (note the call to task.join()), and returns its result directly.
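To make the difference between the three submission methods concrete, here is a minimal sketch. SquareTask and SubmitStylesDemo are hypothetical names used only for this illustration; the pool calls are the standard submit, execute and invoke methods shown above.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

// Hypothetical trivial task: returns the square of n.
class SquareTask extends RecursiveTask<Integer> {
    private final int n;

    SquareTask(int n) {
        this.n = n;
    }

    @Override
    protected Integer compute() {
        return n * n;
    }
}

class SubmitStylesDemo {
    static int[] run() {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            // submit: returns the task immediately; the result is fetched later
            ForkJoinTask<Integer> submitted = pool.submit(new SquareTask(3));

            // execute: returns nothing, so we keep our own reference to the task
            SquareTask executed = new SquareTask(4);
            pool.execute(executed);

            // invoke: blocks until the task completes and returns its result
            int invoked = pool.invoke(new SquareTask(5));

            return new int[] { submitted.join(), executed.join(), invoked };
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints 9 16 25
    }
}
```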

②ForkJoinTask:
For ForkJoinTask we will only briefly introduce its two key operations, fork and join. Here is the source code of the fork method:

    // public methods

    /**
     * Arranges to asynchronously execute this task in the pool the
     * current task is running in, if applicable, or using the {@link
     * ForkJoinPool#commonPool()} if not {@link #inForkJoinPool}.  While
     * it is not necessarily enforced, it is a usage error to fork a
     * task more than once unless it has completed and been
     * reinitialized.  Subsequent modifications to the state of this
     * task or any data it operates on are not necessarily
     * consistently observable by any thread other than the one
     * executing it unless preceded by a call to {@link #join} or
     * related methods, or a call to {@link #isDone} returning {@code
     * true}.
     *
     * @return {@code this}, to simplify usage
     */
    public final ForkJoinTask<V> fork() {
        Thread t;
        if ((t = Thread.currentThread()) instanceof ForkJoinWorkerThread)
            ((ForkJoinWorkerThread)t).workQueue.push(this); // push this task onto the current worker thread's WorkQueue
        else
            ForkJoinPool.common.externalPush(this); // otherwise submit this task to the common pool
        return this;
    }

The fork method first checks whether the current thread is a ForkJoinWorkerThread. If it is, the task is pushed onto that thread's WorkQueue; otherwise, the task is submitted to the common pool (ForkJoinPool.common).
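The two branches of fork() can be exercised with a small sketch like the one below. Doubler, Outer and ForkContextDemo are hypothetical names for this illustration. The first method calls fork() from a plain thread, the second calls it from inside a task submitted to a pool; either way, join() returns the same result. Which thread actually ends up running the child task is left to the framework, so the sketch makes no claim about it.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

// Hypothetical trivial task: returns 2 * n.
class Doubler extends RecursiveTask<Integer> {
    private final int n;

    Doubler(int n) {
        this.n = n;
    }

    @Override
    protected Integer compute() {
        return 2 * n;
    }
}

// Hypothetical parent task that forks a child from inside compute().
class Outer extends RecursiveTask<Integer> {
    @Override
    protected Integer compute() {
        Doubler child = new Doubler(21);
        // If the thread running this method is a ForkJoinWorkerThread,
        // fork() pushes the child onto that worker's own queue.
        child.fork();
        return child.join();
    }
}

class ForkContextDemo {
    // fork() called from a thread that is not a pool worker:
    // the task is handed to ForkJoinPool.commonPool().
    static int forkFromPlainThread() {
        return new Doubler(21).fork().join();
    }

    // fork() called from inside a task submitted to an explicit pool.
    static int forkFromInsideTask() {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new Outer());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("in a pool here? " + ForkJoinTask.inForkJoinPool());
        System.out.println(forkFromPlainThread());
        System.out.println(forkFromInsideTask());
    }
}
```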

Here is the join method, together with the doJoin and doExec methods it relies on:

    /**
     * Returns the result of the computation when it {@link #isDone is
     * done}.  This method differs from {@link #get()} in that
     * abnormal completion results in {@code RuntimeException} or
     * {@code Error}, not {@code ExecutionException}, and that
     * interrupts of the calling thread do <em>not</em> cause the
     * method to abruptly return by throwing {@code
     * InterruptedException}.
     *
     * @return the computed result
     */
    public final V join() {
        int s;
        if ((s = doJoin() & DONE_MASK) != NORMAL) // check whether the task completed normally
            reportException(s);                   // if not, report the exception
        return getRawResult();                    // return the result
    }

    /**
     * Implementation for join, get, quietlyJoin. Directly handles
     * only cases of already-completed, external wait, and
     * unfork+exec.  Others are relayed to ForkJoinPool.awaitJoin.
     *
     * @return status upon completion
     */
    private int doJoin() {
        int s; Thread t; ForkJoinWorkerThread wt; ForkJoinPool.WorkQueue w;
        return (s = status) < 0 ? s :
            ((t = Thread.currentThread()) instanceof ForkJoinWorkerThread) ?
            (w = (wt = (ForkJoinWorkerThread)t).workQueue).
            tryUnpush(this) && (s = doExec()) < 0 ? s :
            wt.pool.awaitJoin(w, this, 0L) :
            externalAwaitDone();
    }

    final int doExec() {
        int s; boolean completed;
        if ((s = status) >= 0) {
            try {
                completed = exec();
            } catch (Throwable rex) {
                return setExceptionalCompletion(rex);
            }
            if (completed)
                s = setCompletion(NORMAL); // if the task has finished, set its status to NORMAL
        }
        return s;
    }

The join operation mainly checks the execution status of the current task and returns its result. There are four task statuses: completed (NORMAL), cancelled (CANCELLED), signal (SIGNAL), and exceptional (EXCEPTIONAL).
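Three of these statuses can be observed through ForkJoinTask's public query methods; SIGNAL is used internally and, as far as I know, has no public accessor. The sketch below is a hedged illustration: OkTask, FailingTask, JoinStatusDemo and the describe helper are hypothetical names, while isCompletedNormally, isCancelled, isCompletedAbnormally and cancel are standard ForkJoinTask methods. It also shows join() rethrowing a task's unchecked exception.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

// Hypothetical task that completes normally.
class OkTask extends RecursiveTask<Integer> {
    @Override
    protected Integer compute() {
        return 7;
    }
}

// Hypothetical task that always fails with an unchecked exception.
class FailingTask extends RecursiveTask<Integer> {
    @Override
    protected Integer compute() {
        throw new IllegalStateException("boom");
    }
}

class JoinStatusDemo {
    // Maps the public status queries onto the status names used in the text.
    static String describe(ForkJoinTask<?> task) {
        if (task.isCompletedNormally()) return "NORMAL";
        if (task.isCancelled()) return "CANCELLED";
        if (task.isCompletedAbnormally()) return "EXCEPTIONAL";
        return "NOT DONE";
    }

    static String[] run() {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            OkTask ok = new OkTask();
            pool.invoke(ok);

            FailingTask bad = new FailingTask();
            pool.execute(bad);
            String joinOutcome;
            try {
                bad.join(); // rethrows the failure as an unchecked exception
                joinOutcome = "no exception";
            } catch (IllegalStateException e) {
                joinOutcome = "join rethrew IllegalStateException";
            }

            OkTask cancelled = new OkTask();
            cancelled.cancel(false); // cancelled before it was ever submitted

            return new String[] {
                describe(ok), describe(bad), describe(cancelled), joinOutcome
            };
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (String s : run()) System.out.println(s);
    }
}
```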
The doJoin() method first checks the status of the task: if the task is already done (status < 0), that status is returned directly. Otherwise, a worker thread tries to take the task back off its own queue and run it through doExec(); if that is not possible, it waits for the task to complete (awaitJoin, or externalAwaitDone for a thread that is not a pool worker). In doExec(), if the task completes successfully its status is set to NORMAL; if an exception occurs, the exceptional completion is recorded and join() then reports the exception.

Implementing a large computation with Fork/join

A truly detailed walkthrough of the Fork/join source code would need further study, and much of the underlying machinery involves optimistic locking. We won't go into that here. Instead, let's use fork/join to sum a long series and compare it with an ordinary loop to see which is more efficient.

Requirement:
Compute the sum 1 + 2 + 3 + ... + n

Below is my implementation using Fork/join. The core idea is to divide a large calculation into small ones; put simply, a huge task is split into many small tasks. The core calculation model is as follows:
[Figure: core calculation model]

Here is the code implementation:

    package com.brickworkers;

    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.RecursiveTask;

    public class FockJoinTest extends RecursiveTask<Long> { // extend RecursiveTask because we need a result

        // maximum size of a range that is summed directly
        private final int DEFAULT_CAPACITY = 10000;

        // two numbers mark the range currently being computed (both inclusive)
        private int start;
        private int end;

        public FockJoinTest(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        protected Long compute() { // implement the compute method
            // there are two cases
            long sum = 0;
            // if the size of the task is within the maximum capacity, sum it directly
            if (end - start < DEFAULT_CAPACITY) {
                for (int i = start; i <= end; i++) { // inclusive, so no value is lost at a split point
                    sum += i;
                }
            } else { // if the maximum capacity is exceeded, split the task
                // compute the middle value of the range
                int middle = (start + end) / 2;
                // recurse on [start, middle] and [middle + 1, end]
                FockJoinTest fockJoinTest1 = new FockJoinTest(start, middle);
                FockJoinTest fockJoinTest2 = new FockJoinTest(middle + 1, end);
                // execute the subtasks
                fockJoinTest1.fork();
                fockJoinTest2.fork();
                // wait for the subtasks to finish and combine the results
                sum = fockJoinTest1.join() + fockJoinTest2.join();
            }
            return sum;
        }

        public static void main(String[] args) {
            ForkJoinPool forkJoinPool = new ForkJoinPool();
            FockJoinTest fockJoinTest = new FockJoinTest(1, 100000000);
            long fockhoinStartTime = System.currentTimeMillis();
            // as we said earlier, invoke returns the result directly
            long result = forkJoinPool.invoke(fockJoinTest);
            System.out.println("Fock/join calculation time: " + (System.currentTimeMillis() - fockhoinStartTime));

            long sum = 0;
            long normalStartTime = System.currentTimeMillis();
            for (int i = 1; i <= 100000000; i++) {
                sum += i;
            }
            System.out.println("Normal calculation time: " + (System.currentTimeMillis() - normalStartTime));
        }
    }

    // Execution result:
    // Fock/join calculation time: 33
    // Normal calculation time: 141

Note that in the example above, the program's efficiency depends first of all on the DEFAULT_CAPACITY you choose. If the value is too small, the work is broken into a great many subtasks and efficiency drops; a somewhat larger value gives better efficiency. In my tests, the relationship between running time and DEFAULT_CAPACITY was roughly as shown in the following figure:
[Figure: running time versus DEFAULT_CAPACITY]
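One way to explore the effect of the threshold yourself is to make it a constructor parameter and time a few values. The sketch below is an assumption-laden illustration, not the original measurement: ThresholdSum and ThresholdDemo are hypothetical names, the thresholds tried are arbitrary, and wall-clock timings taken this way are only rough (no JIT warm-up), so no expected timings are given.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical variant of the sum task with a configurable threshold.
class ThresholdSum extends RecursiveTask<Long> {
    private final int start, end, threshold; // start and end are inclusive

    ThresholdSum(int start, int end, int threshold) {
        if (threshold < 1) throw new IllegalArgumentException("threshold must be >= 1");
        this.start = start;
        this.end = end;
        this.threshold = threshold;
    }

    @Override
    protected Long compute() {
        if (end - start < threshold) {
            long sum = 0;
            for (int i = start; i <= end; i++) sum += i;
            return sum;
        }
        int middle = start + (end - start) / 2;
        ThresholdSum left = new ThresholdSum(start, middle, threshold);
        ThresholdSum right = new ThresholdSum(middle + 1, end, threshold);
        left.fork();                          // run the left half asynchronously
        return right.compute() + left.join(); // compute the right half in this thread
    }
}

class ThresholdDemo {
    // Sums 1..n with the given threshold and prints a rough elapsed time.
    static long sum(int n, int threshold) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            long t0 = System.nanoTime();
            long result = pool.invoke(new ThresholdSum(1, n, threshold));
            long elapsedMs = (System.nanoTime() - t0) / 1_000_000;
            System.out.println("threshold " + threshold + ": " + elapsedMs + " ms");
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (int threshold : new int[] {100, 10_000, 1_000_000}) {
            sum(100_000_000, threshold);
        }
    }
}
```

Whatever threshold is chosen, the computed sum must stay the same; only the running time should change.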
Closing Note

In everyday development there are many places where work can be split up like this, provided of course that you have plenty of resources to spare. For example, scheduled jobs that run in the middle of the night, when resources are abundant, can use this approach to run faster. Likewise, when exporting a project report file, the rows of a very large data set can be split into parts and processed in parallel to speed things up. Give it a try.

I hope it will be of some help to you.

