Netty source code analysis: lifting the veil on the reactor thread (I)

Source: Internet
Author: User

The core of Netty is the reactor thread, whose widely used implementation is NioEventLoop. So what exactly does NioEventLoop do internally? How does Netty keep the event loop polling efficiently and tasks executing on time? And how does it gracefully work around the JDK's NIO bug? With these questions in mind, this article walks you through how Netty's reactor thread really works [source code based on 4.1.6.Final]

Reactor thread startup

The run method of NioEventLoop is the body of the reactor thread, and the thread is started the first time a task is added

The execute method of SingleThreadEventExecutor, the parent class of NioEventLoop

@Override
public void execute(Runnable task) {
    ...
    boolean inEventLoop = inEventLoop();
    if (inEventLoop) {
        addTask(task);
    } else {
        startThread();
        addTask(task);
        ...
    }
    ...
}

When an external thread adds a task to the task queue, it first calls startThread(). Netty checks whether the reactor thread has already been started; if not, it starts the thread and then adds the task to the task queue.

private void startThread() {
    if (STATE_UPDATER.get(this) == ST_NOT_STARTED) {
        if (STATE_UPDATER.compareAndSet(this, ST_NOT_STARTED, ST_STARTED)) {
            doStartThread();
        }
    }
}
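To make the start-once guard concrete, here is a minimal sketch of the same compare-and-set pattern outside Netty. The class name StartOnce, the counter, and the demo method are made up for illustration; only the state check followed by compareAndSet mirrors the snippet above.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

// Hypothetical class for illustration; not Netty's SingleThreadEventExecutor.
final class StartOnce {
    private static final int ST_NOT_STARTED = 1;
    private static final int ST_STARTED = 2;

    private static final AtomicIntegerFieldUpdater<StartOnce> STATE_UPDATER =
            AtomicIntegerFieldUpdater.newUpdater(StartOnce.class, "state");

    // AtomicIntegerFieldUpdater requires a volatile int field.
    private volatile int state = ST_NOT_STARTED;

    // Counts how many times the "start" action actually ran.
    private final AtomicInteger starts = new AtomicInteger();

    void startThread() {
        // Cheap read first, then a CAS so that only one caller wins the race.
        if (STATE_UPDATER.get(this) == ST_NOT_STARTED
                && STATE_UPDATER.compareAndSet(this, ST_NOT_STARTED, ST_STARTED)) {
            starts.incrementAndGet(); // stands in for doStartThread()
        }
    }

    // Calls startThread() from several threads and returns the number of real starts.
    static int demo(int callers) {
        StartOnce s = new StartOnce();
        Thread[] threads = new Thread[callers];
        for (int i = 0; i < callers; i++) {
            threads[i] = new Thread(s::startThread);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return s.starts.get();
    }

    public static void main(String[] args) {
        System.out.println("starts = " + demo(8));
    }
}
```

However many threads race into startThread(), the CAS lets exactly one of them move the state from not started to started, so the start action runs once.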

When SingleThreadEventExecutor executes doStartThread, it calls the execute method of its internal executor, wrapping the call to NioEventLoop's run method in a Runnable and handing it to a thread to execute

private void doStartThread() {
    ...
    executor.execute(new Runnable() {
        @Override
        public void run() {
            thread = Thread.currentThread();
            ...
                SingleThreadEventExecutor.this.run();
            ...
        }
    });
}

This thread is created by the executor and is the Netty reactor thread entity. By default the executor is a ThreadPerTaskExecutor

By default, each time ThreadPerTaskExecutor's execute method runs, it creates a FastThreadLocalThread through DefaultThreadFactory, and this thread is the reactor thread entity in Netty

ThreadPerTaskExecutor

public void execute(Runnable command) {
    threadFactory.newThread(command).start();
}
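The one-thread-per-call behaviour can be tried in isolation with plain JDK types. The sketch below is an assumption-laden stand-in: OneThreadPerTask and runAndGetThreadName are hypothetical names, and a plain Thread is used where Netty's DefaultThreadFactory would create a FastThreadLocalThread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the idea behind ThreadPerTaskExecutor.
final class OneThreadPerTask implements Executor {
    private final ThreadFactory threadFactory;

    OneThreadPerTask(ThreadFactory threadFactory) {
        this.threadFactory = threadFactory;
    }

    @Override
    public void execute(Runnable command) {
        // No pooling and no queue: every call gets a brand-new thread.
        threadFactory.newThread(command).start();
    }

    // Runs one task and returns the name of the thread that executed it.
    static String runAndGetThreadName(String threadName) {
        Executor executor = new OneThreadPerTask(r -> new Thread(r, threadName));
        AtomicReference<String> seen = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        executor.execute(() -> {
            seen.set(Thread.currentThread().getName());
            done.countDown();
        });
        try {
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("task did not run in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndGetThreadName("reactor-1"));
    }
}
```

Because the event loop hands its run method to such an executor only once, "one thread per task" in practice means one dedicated thread per event loop.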

Why ThreadPerTaskExecutor is combined with DefaultThreadFactory to create a FastThreadLocalThread is not covered in detail here; the following code snippets explain it briefly

A standard Netty program creates a NioEventLoopGroup, which calls the following code in its parent class MultithreadEventExecutorGroup

protected MultithreadEventExecutorGroup(int nThreads, Executor executor,
                                        EventExecutorChooserFactory chooserFactory, Object... args) {
    if (executor == null) {
        executor = new ThreadPerTaskExecutor(newDefaultThreadFactory());
    }
}

and then passes the executor on to NioEventLoop through newChild

@Override
protected EventLoop newChild(Executor executor, Object... args) throws Exception {
    return new NioEventLoop(this, executor, (SelectorProvider) args[0],
        ((SelectStrategyFactory) args[1]).newSelectStrategy(), (RejectedExecutionHandler) args[2]);
}

To summarize reactor thread creation and startup: Netty's reactor thread is created when the first task is added, the thread entity is a FastThreadLocalThread (a later article will cover it in detail), and the thread finally executes the main body, NioEventLoop's run method.
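The startup sequence summarized above can be modelled in a few dozen lines. This is a rough sketch rather than Netty's code: LazyLoopExecutor, its blocking queue, and the stop flag are all assumptions made for illustration, and the loop only runs tasks instead of also polling a selector.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, heavily simplified model; not Netty's SingleThreadEventExecutor.
final class LazyLoopExecutor {
    private final BlockingQueue<Runnable> taskQueue = new LinkedBlockingQueue<>();
    private final AtomicBoolean started = new AtomicBoolean();
    private volatile Thread thread;   // set by the loop thread itself
    private volatile boolean stopped;

    boolean isStarted() {
        return started.get();
    }

    boolean inEventLoop() {
        return Thread.currentThread() == thread;
    }

    void execute(Runnable task) {
        // An external caller starts the loop thread on first use, exactly once.
        if (!inEventLoop() && started.compareAndSet(false, true)) {
            Thread t = new Thread(this::run, "mini-reactor");
            t.setDaemon(true);
            t.start();
        }
        taskQueue.add(task);
    }

    void stop() {
        stopped = true;
    }

    private void run() {
        thread = Thread.currentThread();
        try {
            while (!stopped) {
                // A real reactor would also poll a selector here.
                Runnable task = taskQueue.poll(50, TimeUnit.MILLISECONDS);
                if (task != null) {
                    task.run();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // True if no thread existed before the first task and the task ran on the loop thread.
    static boolean demo() {
        LazyLoopExecutor loop = new LazyLoopExecutor();
        boolean startedBefore = loop.isStarted();
        AtomicBoolean ranInLoop = new AtomicBoolean();
        CountDownLatch done = new CountDownLatch(1);
        loop.execute(() -> {
            ranInLoop.set(loop.inEventLoop());
            done.countDown();
        });
        boolean finished;
        try {
            finished = done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        loop.stop();
        return finished && !startedBefore && ranInLoop.get();
    }

    public static void main(String[] args) {
        System.out.println("lazy start worked: " + demo());
    }
}
```

The thread records itself in the thread field as its first action, which is what makes the later inEventLoop() comparison possible.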

Reactor thread execution

So let's focus on the run method of NioEventLoop below.

@Override
protected void run() {
    for (;;) {
        try {
            switch (selectStrategy.calculateStrategy(selectNowSupplier, hasTasks())) {
                case SelectStrategy.CONTINUE:
                    continue;
                case SelectStrategy.SELECT:
                    select(wakenUp.getAndSet(false));
                    if (wakenUp.get()) {
                        selector.wakeup();
                    }
                default:
                    // fallthrough
            }
            processSelectedKeys();
            runAllTasks(...);
        } catch (Throwable t) {
            handleLoopException(t);
        }
        ...
    }
}

Stripped down to its backbone, what the reactor thread does is actually quite simple and can be explained with the following diagram


[Figure: Reactor action]

What the reactor thread does can be roughly divided into three steps, repeated in a continuous loop

1. First, poll for IO events on all channels registered on the reactor thread's selector

select(wakenUp.getAndSet(false));
if (wakenUp.get()) {
    selector.wakeup();
}

2. Handle the channels that generated network IO events

processSelectedKeys();

3. Process the task queue

runAllTasks(...);
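The three steps can be sketched as a single loop iteration on top of java.nio. This is a single-threaded toy under stated assumptions: MiniReactorLoop, runOnce, and the two counters are hypothetical names, no channels are registered, and key handling is reduced to counting.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.Queue;

// Hypothetical single-threaded skeleton of the select / process keys / run tasks cycle.
final class MiniReactorLoop {
    private final Selector selector = openSelector();
    private final Queue<Runnable> tasks = new ArrayDeque<>(); // not thread-safe; sketch only
    int handledKeys;
    int executedTasks;

    private static Selector openSelector() {
        try {
            return Selector.open();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void addTask(Runnable task) {
        tasks.add(task);
    }

    // One iteration of the loop; timeoutMillis must be greater than zero.
    void runOnce(long timeoutMillis) {
        try {
            // 1. Poll IO events, without blocking if tasks are already waiting.
            if (tasks.isEmpty()) {
                selector.select(timeoutMillis);
            } else {
                selector.selectNow();
            }
            // 2. Handle the channels that produced IO events.
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                it.next();   // a real loop would dispatch on the key's readyOps()
                it.remove();
                handledKeys++;
            }
            // 3. Run the queued tasks.
            Runnable task;
            while ((task = tasks.poll()) != null) {
                task.run();
                executedTasks++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void close() {
        try {
            selector.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        MiniReactorLoop loop = new MiniReactorLoop();
        loop.addTask(() -> System.out.println("task 1"));
        loop.addTask(() -> System.out.println("task 2"));
        loop.runOnce(10);
        loop.close();
    }
}
```

The order matters: IO events are collected first, then dispatched, and only then are queued tasks run, so a task submitted during dispatch is picked up in the same iteration.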

Each step is described in detail below

Select operation
select(wakenUp.getAndSet(false));
if (wakenUp.get()) {
    selector.wakeup();
}

wakenUp indicates whether a blocking select operation should be woken up. As you can see, Netty sets wakenUp to false before each new loop iteration, marking the start of a new round of polling. We break the select operation itself down into the steps below.

1. The deadline of a scheduled task is approaching, so this poll is interrupted

int selectCnt = 0;
long currentTimeNanos = System.nanoTime();
long selectDeadLineNanos = currentTimeNanos + delayNanos(currentTimeNanos);
for (;;) {
    long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
    if (timeoutMillis <= 0) {
        if (selectCnt == 0) {
            selector.selectNow();
            selectCnt = 1;
        }
        break;
    }
    ....
}

As we can see, the select operation of the reactor thread in NioEventLoop is itself a for loop. In the first step of that loop, if the deadline of a task in the scheduled task queue is about to arrive (within roughly 0.5 ms), the loop exits. In addition, if no select operation has been performed yet before exiting (if (selectCnt == 0)), selectNow() is called once; this method returns immediately without blocking
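The rounding in the timeoutMillis expression is easy to check by hand. The sketch below copies only the arithmetic into a hypothetical helper (SelectTimeout is not a Netty class) and evaluates it for a few delays.

```java
// Hypothetical helper that isolates the timeout arithmetic for experimentation.
final class SelectTimeout {
    // Adds 0.5 ms and truncates, which rounds the remaining delay to the nearest millisecond.
    static long timeoutMillis(long selectDeadLineNanos, long currentTimeNanos) {
        return (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
    }

    public static void main(String[] args) {
        long now = 0L;
        // 0.4 ms left: (400000 + 500000) / 1000000 = 0, so the loop breaks without blocking.
        System.out.println(timeoutMillis(400_000L, now));
        // 0.6 ms left: (600000 + 500000) / 1000000 = 1, so select may block for 1 ms.
        System.out.println(timeoutMillis(600_000L, now));
        // Deadline already 2 ms in the past: the result is negative, which also breaks.
        System.out.println(timeoutMillis(-2_000_000L, now));
    }
}
```

Working through the arithmetic shows that the cut-off sits just below 0.5 ms: a remaining delay of 499,999 ns yields 0, while exactly 500,000 ns already yields 1.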

One note here: Netty's scheduled task queue is sorted by delay in ascending order, and the delayNanos(currentTimeNanos) method takes the delay of the first scheduled task

protected long delayNanos(long currentTimeNanos) {
    ScheduledFutureTask<?> scheduledTask = peekScheduledTask();
    if (scheduledTask == null) {
        return SCHEDULE_PURGE_INTERVAL;
    }
    return scheduledTask.delayNanos(currentTimeNanos);
}
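The "peek at the earliest deadline" idea can be sketched with a plain PriorityQueue. Everything named here is an assumption for illustration: ScheduledDelay is a made-up class, deadlines are bare long values instead of ScheduledFutureTask objects, and the one-second fallback is simply a value chosen for the sketch.

```java
import java.util.PriorityQueue;

// Hypothetical model of a deadline-ordered scheduled task queue.
final class ScheduledDelay {
    // Assumed fallback delay when nothing is scheduled (one second, chosen for this sketch).
    static final long DEFAULT_DELAY_NANOS = 1_000_000_000L;

    // Deadlines in nanoseconds; PriorityQueue keeps the smallest one at the head.
    private final PriorityQueue<Long> deadlines = new PriorityQueue<>();

    void schedule(long deadlineNanos) {
        deadlines.add(deadlineNanos);
    }

    // Delay until the earliest deadline, never negative.
    long delayNanos(long currentTimeNanos) {
        Long head = deadlines.peek();
        if (head == null) {
            return DEFAULT_DELAY_NANOS;
        }
        return Math.max(0L, head - currentTimeNanos);
    }

    public static void main(String[] args) {
        ScheduledDelay queue = new ScheduledDelay();
        queue.schedule(5_000L);
        queue.schedule(2_000L);
        // The earliest deadline (2000) wins regardless of insertion order.
        System.out.println(queue.delayNanos(500L));
    }
}
```

Because only the head of the queue is inspected, computing the select deadline stays cheap no matter how many tasks are scheduled.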

The details of Netty's task queues (ordinary tasks, scheduled tasks, and tail tasks) will be covered in a separate article, so we won't expand on them here

2. A task is found to have been added during polling, so this poll is interrupted

for (;;) {
    // 1. The deadline of a scheduled task is approaching, interrupt this poll
    ...
    // 2. A task was added during polling, interrupt this poll
    if (hasTasks() && wakenUp.compareAndSet(false, true)) {
        selector.selectNow();
        selectCnt = 1;
        break;
    }
    ....
}

To ensure that tasks in the task queue are executed in a timely manner, Netty checks whether the task queue is empty before performing the blocking select operation. If it is not empty, Netty performs one non-blocking select operation and jumps out of the loop

3. Blocking select operation

for (;;) {
    // 1. The deadline of a scheduled task is approaching, interrupt this poll
    ...
    // 2. A task was added during polling, interrupt this poll
    ...
    // 3. Blocking select operation
    int selectedKeys = selector.select(timeoutMillis);
    selectCnt ++;
    if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
        break;
    }
    ....
}

Reaching this step means that Netty's task queue is empty and that no scheduled task's deadline has arrived yet (each is more than 0.5 ms away), so a blocking select operation is performed here, lasting until the deadline of the first scheduled task

Here we can ask ourselves a question: if the delay of the first scheduled task is very long, say one hour, is it possible that the thread stays blocked in the select operation the whole time? Of course it is! However, as soon as a new task is added during this time, the blocking is released.

An external thread calls the execute method to add a task

@Override
public void execute(Runnable task) {
    ...
    // inEventLoop is false
    ...
}

It calls the wakeup method to wake up the blocked selector

protected void wakeup(boolean inEventLoop) {
    if (!inEventLoop && wakenUp.compareAndSet(false, true)) {
        selector.wakeup();
    }
}

As you can see, when an external thread adds a task, it calls the wakeup method to wake up selector.select(timeoutMillis)
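The effect of wakeup on a blocked select can be observed with the JDK alone. In this sketch the class WakeupDemo and the method blockedMillis are hypothetical; the AtomicBoolean guard imitates the wakenUp flag so that wakeup() is issued at most once.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo: an external thread releases a selector blocked in select(timeout).
final class WakeupDemo {
    // Returns how many milliseconds select(timeoutMillis) actually took.
    static long blockedMillis(long timeoutMillis) {
        try (Selector selector = Selector.open()) {
            AtomicBoolean wakenUp = new AtomicBoolean(false);
            Thread external = new Thread(() -> {
                // The CAS guard makes sure wakeup() is called at most once per round.
                if (wakenUp.compareAndSet(false, true)) {
                    selector.wakeup();
                }
            });
            long start = System.nanoTime();
            external.start();
            // Returns early: wakeup() either interrupts the select in progress or,
            // if it ran first, makes this select return immediately.
            selector.select(timeoutMillis);
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
            external.join();
            return elapsedMillis;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("select returned after " + blockedMillis(10_000L) + " ms");
    }
}
```

The example does not depend on which thread gets there first, because a wakeup issued before the select starts is remembered by the selector and applied to the next selection operation.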

After the blocking select operation returns, Netty performs a series of state checks to decide whether to interrupt this poll. The poll is interrupted when any of the following holds

    • An IO event was polled ( selectedKeys != 0 )
    • The oldWakenUp parameter is true
    • The task queue has tasks ( hasTasks() )
    • The first scheduled task is about to be executed ( hasScheduledTasks() )
    • The user actively woke it up ( wakenUp.get() )

4. Working around the JDK's NIO bug

For a description of the bug, see http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6595055

This bug causes the selector to spin in empty polling, driving the CPU to 100% and making the NIO server unavailable. Strictly speaking, Netty does not fix the JDK bug; instead it cleverly sidesteps it, as follows

long currentTimeNanos = System.nanoTime();
for (;;) {
    // 1. The deadline of a scheduled task is approaching, interrupt this poll
    ...
    // 2. A task was added during polling, interrupt this poll
    ...
    // 3. Blocking select operation
    selector.select(timeoutMillis);
    // 4. Work around the JDK's NIO bug
    long time = System.nanoTime();
    if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) {
        selectCnt = 1;
    } else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
            selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
        rebuildSelector();
        selector = this.selector;
        selector.selectNow();
        selectCnt = 1;
        break;
    }
    currentTimeNanos = time;
    ...
}

Before each selector.select(timeoutMillis) call, Netty records the start time in currentTimeNanos, and records the end time after select returns. It then checks whether the select operation lasted at least timeoutMillis milliseconds (the condition time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos is perhaps easier to read as time - currentTimeNanos >= TimeUnit.MILLISECONDS.toNanos(timeoutMillis))
If the duration is greater than or equal to timeoutMillis, the poll was a valid one and selectCnt is reset. Otherwise, the blocking method did not block for that long, which may mean the JDK's empty polling bug was triggered. When the number of such polls reaches a threshold, 512 by default, Netty starts rebuilding the selector.
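The counting logic can be isolated into a small state machine. The sketch below is an illustration under assumptions: EmptyPollDetector and afterSelect are hypothetical names, timestamps are passed in so the behaviour is deterministic, and it only models selects that returned with nothing to do (in the real loop the other break conditions are checked first).

```java
import java.util.concurrent.TimeUnit;

// Hypothetical model of the premature-return counter; not Netty code.
final class EmptyPollDetector {
    private final int rebuildThreshold;
    private int selectCnt;

    EmptyPollDetector(int rebuildThreshold) {
        this.rebuildThreshold = rebuildThreshold;
    }

    // Call after each blocking select; returns true when the selector should be rebuilt.
    boolean afterSelect(long startNanos, long endNanos, long timeoutMillis) {
        selectCnt++;
        if (endNanos - startNanos >= TimeUnit.MILLISECONDS.toNanos(timeoutMillis)) {
            // select really blocked for the whole timeout: a valid poll, reset the counter.
            selectCnt = 1;
            return false;
        }
        if (rebuildThreshold > 0 && selectCnt >= rebuildThreshold) {
            // Too many premature returns in a row: suspect the empty polling bug.
            selectCnt = 1;
            return true;
        }
        return false;
    }

    int selectCnt() {
        return selectCnt;
    }

    public static void main(String[] args) {
        EmptyPollDetector detector = new EmptyPollDetector(3);
        for (int i = 1; i <= 3; i++) {
            // Each select "returns" after 10 ns although the timeout was one second.
            System.out.println("rebuild after poll " + i + ": " + detector.afterSelect(0L, 10L, 1000L));
        }
    }
}
```

With a threshold of 3, the first two premature returns only increase the counter and the third one requests a rebuild; a single full-length select in between would have reset the count.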

The code that sets the empty polling threshold is as follows

int selectorAutoRebuildThreshold = SystemPropertyUtil.getInt("io.netty.selectorAutoRebuildThreshold", 512);
if (selectorAutoRebuildThreshold < MIN_PREMATURE_SELECTOR_RETURNS) {
    selectorAutoRebuildThreshold = 0;
}
SELECTOR_AUTO_REBUILD_THRESHOLD = selectorAutoRebuildThreshold;
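The same normalization can be reproduced with the JDK's Integer.getInteger in place of Netty's SystemPropertyUtil. In this sketch RebuildThresholdConfig is a hypothetical class, and the minimum of 3 is an assumed value for the constant; check the Netty source for the real one.

```java
// Hypothetical helper; uses Integer.getInteger instead of Netty's SystemPropertyUtil.
final class RebuildThresholdConfig {
    // Assumed minimum for this sketch; thresholds below it disable rebuilding.
    static final int MIN_PREMATURE_SELECTOR_RETURNS = 3;

    // Values that are too small are treated as "feature switched off" (0).
    static int normalize(int configured) {
        return configured < MIN_PREMATURE_SELECTOR_RETURNS ? 0 : configured;
    }

    // Reads the system property, falling back to 512 when it is unset or unparsable.
    static int fromSystemProperty() {
        return normalize(Integer.getInteger("io.netty.selectorAutoRebuildThreshold", 512));
    }

    public static void main(String[] args) {
        System.out.println("effective threshold = " + fromSystemProperty());
    }
}
```

Starting the JVM with -Dio.netty.selectorAutoRebuildThreshold=0 would therefore switch the rebuild logic off, because the counter check requires a threshold greater than zero.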

Now let's briefly describe how Netty uses rebuildSelector to work around the bug. The rebuildSelector operation is simple: create a new selector and transfer the channels previously registered on the old selector to the new one. The skeleton after extracting the main code is as follows

public void rebuildSelector() {
    final Selector oldSelector = selector;
    final Selector newSelector;
    newSelector = openSelector();

    int nChannels = 0;
    for (;;) {
        try {
            for (SelectionKey key : oldSelector.keys()) {
                Object a = key.attachment();
                if (!key.isValid() || key.channel().keyFor(newSelector) != null) {
                    continue;
                }
                int interestOps = key.interestOps();
                key.cancel();
                SelectionKey newKey = key.channel().register(newSelector, interestOps, a);
                if (a instanceof AbstractNioChannel) {
                    ((AbstractNioChannel) a).selectionKey = newKey;
                }
                nChannels++;
            }
        } catch (ConcurrentModificationException e) {
            // Probably due to concurrent modification of the key set.
            continue;
        }
        break;
    }
    selector = newSelector;
    oldSelector.close();
}

First, the openSelector() method creates a new selector. Then an infinite loop runs, and the transfer is restarted whenever a ConcurrentModificationException is thrown on the selection keys during execution.

The specific transfer steps are

    1. Get a valid key
    2. Cancel the key's event registration on the old selector
    3. Register the channel corresponding to the key on the new selector
    4. Rebind the relationship between the channel and the new key

Once the transfer is complete, the original selector can be discarded and all subsequent polling is done on the new selector
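The transfer steps can be exercised against real JDK selectors. The following sketch makes several simplifying assumptions: SelectorMigration is a hypothetical class, a Pipe's source channel stands in for a network channel, the key set is copied up front instead of retrying on ConcurrentModificationException, and step 4 is left out because the channel can look up its new key with keyFor.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of moving registrations from one selector to another.
final class SelectorMigration {
    // Moves every valid registration to a fresh selector, closes the old one, returns the new one.
    static Selector rebuild(Selector oldSelector) throws IOException {
        Selector newSelector = Selector.open();
        // Snapshot of the keys; this sketch runs single-threaded, so no retry loop is needed.
        List<SelectionKey> keys = new ArrayList<>(oldSelector.keys());
        for (SelectionKey key : keys) {
            // 1. Skip invalid keys and channels that were already moved.
            if (!key.isValid() || key.channel().keyFor(newSelector) != null) {
                continue;
            }
            Object attachment = key.attachment();
            int interestOps = key.interestOps();
            // 2. Cancel the registration on the old selector.
            key.cancel();
            // 3. Register the same channel, interest set and attachment on the new selector.
            key.channel().register(newSelector, interestOps, attachment);
        }
        oldSelector.close();
        return newSelector;
    }

    // Registers one channel, rebuilds, and checks that the registration survived the move.
    static boolean demo() {
        try {
            Pipe pipe = Pipe.open();
            pipe.source().configureBlocking(false);
            Selector oldSelector = Selector.open();
            Object attachment = "channel-state";
            pipe.source().register(oldSelector, SelectionKey.OP_READ, attachment);

            Selector newSelector = rebuild(oldSelector);
            SelectionKey moved = pipe.source().keyFor(newSelector);
            boolean ok = moved != null
                    && moved.isValid()
                    && moved.interestOps() == SelectionKey.OP_READ
                    && moved.attachment() == attachment
                    && !oldSelector.isOpen()
                    && newSelector.keys().size() == 1;
            newSelector.close();
            pipe.source().close();
            pipe.sink().close();
            return ok;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("migration ok: " + demo());
    }
}
```

Carrying the interest set and the attachment over unchanged is what lets the channel continue as if nothing had happened once the new selector takes over.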

Finally, to summarize what the select step of the reactor thread does: it keeps polling for IO events while continually checking for scheduled tasks and ordinary tasks, so that tasks in Netty's task queues are executed in time. Along the way, the polling process uses a counter to sidestep the JDK empty polling bug. The process is clear

For reasons of space, the following two steps will be covered in follow-up articles, so stay tuned

Process selected keys

To be continued

Run tasks

To be continued

Finally, let's return to the diagram from the beginning of the article and review what Netty's reactor thread does.


[Figure: Reactor action]
    1. Poll for IO events
    2. Handle the polled events
    3. Execute the tasks in the task queue