Kafka Source Code In-Depth Analysis, Part 3: Producer and Java NIO


Last Update:2018-07-27 Source: Internet

Author: User

Tags epoll prepare


In the previous article we analyzed the metadata update mechanism, which raises a question: how does the Sender communicate with the server? That is the job of the network layer. Like many Java projects, the Kafka client builds its network layer on Java NIO and wraps its own abstractions on top of it.

Let's take a look at the section between the sender and the server:

As you can see, the Kafka client wraps a network layer around Java NIO, and the top-level interface of that layer is KafkaClient. Its hierarchy is as follows:

In this article, we start from the bottom of the stack with a detailed look at Java NIO.

The 4 major components of NIO

Buffer and Channel

Channel: In traditional Java network programming there is the Socket/ServerSocket pair: each Socket object represents one connection, and ServerSocket is used by the server to listen for new connections.
In NIO, the corresponding pair is SocketChannel/ServerSocketChannel.

The following figure shows the class hierarchy of SocketChannel/ServerSocketChannel:

    public interface Channel extends Closeable {
        boolean isOpen();
        void close() throws IOException;
    }

    public interface ReadableByteChannel extends Channel {
        int read(ByteBuffer dst) throws IOException;
    }

    public interface WritableByteChannel extends Channel {
        int write(ByteBuffer src) throws IOException;
    }

As the code shows, the most basic operations on a channel are read and write, and both must be passed a ByteBuffer rather than an ordinary byte array.
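To make the read/write contract concrete, here is a minimal sketch that copies data from one channel to another through a ByteBuffer. The class name ChannelCopyDemo and the in-memory channels (created with Channels.newChannel) are assumptions chosen for illustration; a SocketChannel is read and written in the same way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: copies bytes channel -> ByteBuffer -> channel.
class ChannelCopyDemo {

    static String copy(String text) {
        byte[] input = text.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ReadableByteChannel in = Channels.newChannel(new ByteArrayInputStream(input));
             WritableByteChannel out = Channels.newChannel(sink)) {
            ByteBuffer buffer = ByteBuffer.allocate(8);   // deliberately small to force several rounds
            while (in.read(buffer) != -1) {               // channel -> buffer
                buffer.flip();                            // switch the buffer from filling to draining
                while (buffer.hasRemaining()) {
                    out.write(buffer);                    // buffer -> channel
                }
                buffer.clear();                           // make the buffer writable again
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(copy("hello, java nio channels"));
    }
}
```

The flip()/clear() pair is the part that is easy to forget: a ByteBuffer has a single position, so it has to be switched explicitly between being filled by read and being drained by write.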

Buffer: NIO also has a class hierarchy built around Buffer, which is not detailed here. It is enough to know that a buffer wraps the data a channel sends and receives.

Selector

The main job of a Selector is to drive the network event loop: by calling Selector.select() repeatedly, it polls every channel for read and write events.

SelectionKey

A SelectionKey records the set of events on a channel; each registered channel has a corresponding SelectionKey.
A SelectionKey is also the link between a Selector and a channel: from the key you can obtain both the Selector and the channel.
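The relationship between Selector, channel and SelectionKey can be sketched as follows. This is only an illustration: the class name SelectionKeyDemo and the attachment string "conn-1" are made up, and a java.nio.channels.Pipe stands in for a socket so that the example needs no network.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: one Selector, one channel, and the SelectionKey linking them.
class SelectionKeyDemo {

    static String describe() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink(); Pipe.SourceChannel source = pipe.source()) {
                source.configureBlocking(false);              // only non-blocking channels can be registered
                SelectionKey key = source.register(selector, SelectionKey.OP_READ);
                key.attach("conn-1");                         // arbitrary per-channel state

                sink.write(ByteBuffer.wrap("ping".getBytes(StandardCharsets.UTF_8)));

                int ready = selector.select(1000);            // wait up to 1 s for events
                StringBuilder result = new StringBuilder("ready=" + ready);
                for (SelectionKey k : selector.selectedKeys()) {
                    boolean sameChannel = k.channel() == source;      // key -> channel
                    boolean sameSelector = k.selector() == selector;  // key -> selector
                    result.append(" id=").append(k.attachment())
                          .append(" readable=").append(k.isReadable())
                          .append(" linked=").append(sameChannel && sameSelector);
                }
                selector.selectedKeys().clear();              // the caller must remove handled keys
                return result.toString();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The attachment slot is what lets a framework hang its own per-connection object on the key, which is how the Kafka code later in this article gets from a SelectionKey back to its channel wrapper.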

How these 4 components work together is described in detail below.

The 4 network IO models, epoll and IOCP

The following 4 IO models are described in Advanced Programming in the UNIX Environment (there are more than 4, but these are the ones in common use):

Blocking IO: read/write blocks until the call can complete.

Non-blocking IO: read/write returns immediately when no data is available, so the application has to poll.

IO multiplexing: read/write can only work on one socket at a time, but a server has thousands of socket connections. How can a single function call watch for read and write events on all of them? That is the IO multiplexing model, which on Linux corresponds to three mechanisms: select, poll, and epoll.

Asynchronous IO: Linux has no widely used equivalent for sockets; on Windows it corresponds to IOCP.

Reactor mode vs. Proactor mode

Many readers will have heard of these 2 design patterns for network IO; detailed treatments are easy to find online.

Here is just a plain-language explanation of the 2 patterns:

Reactor mode: active mode. "Active" means the application keeps polling, asking the operating system whether IO is ready. select/poll/epoll on Linux are all active mode: the application needs a loop that keeps polling.
In this mode, the actual IO operation is performed by the application.

Proactor mode: passive mode. You hand read/write over to the operating system entirely; the operating system performs the actual IO operation and then calls back into your application. Windows IOCP follows this model, and the Asio library in C++ Boost is a typical Proactor implementation.

Epoll programming model: 3 stages

On Linux, Java NIO is implemented on top of epoll. Every epoll-based framework goes through 3 phases:
register events (connect, accept, read, write), poll to see whether IO is ready, and perform the actual IO operation.

The following code shows the basic skeleton of epoll programming in C on Linux:

    /* Phase 1: call epoll_ctl(xx) to register events */
    for (;;) {
        nfds = epoll_wait(epfd, events, 20, 500);       /* Phase 2: poll all sockets */
        for (i = 0; i < nfds; ++i) {                    /* process the polling results */
            if (events[i].data.fd == listenfd) {        /* accept event ready */
                /* Phase 3: perform the actual IO operation, accept */
                connfd = accept(listenfd, (struct sockaddr *)&clientaddr, &clilen);
                ev.data.fd = connfd;
                ev.events = EPOLLIN | EPOLLET;
                epoll_ctl(epfd, EPOLL_CTL_ADD, connfd, &ev);    /* back to phase 1: re-register */
            } else if (events[i].events & EPOLLIN) {            /* read ready */
                /* Phase 3: perform the actual IO operation */
                if ((n = read(sockfd, line, MAXLINE)) < 0) {
                    /* error handling not shown in the original */
                }
                ev.data.ptr = md;
                ev.events = EPOLLOUT | EPOLLET;
                epoll_ctl(epfd, EPOLL_CTL_MOD, sockfd, &ev);    /* back to phase 1: re-register */
            } else if (events[i].events & EPOLLOUT) {           /* write ready */
                struct myepoll_data *md = (struct myepoll_data *)events[i].data.ptr;
                sockfd = md->fd;
                /* Phase 3: perform the actual IO operation */
                send(sockfd, md->ptr, strlen((char *)md->ptr), 0);
                ev.data.fd = sockfd;
                ev.events = EPOLLIN | EPOLLET;
                epoll_ctl(epfd, EPOLL_CTL_MOD, sockfd, &ev);    /* back to phase 1: re-register */
            } else {
                /* other processing */
            }
        }
    }

Similarly, the Selector in NIO goes through the same 3 phases. The following compares the use of Selector and epoll:

As you can see, the 2 are written differently, but both follow the same 3 phases.
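For the Selector side, the 3 phases can be sketched with plain Java NIO as below. This is a minimal illustration under stated assumptions, not Kafka code: the class ThreePhaseDemo and its method are hypothetical, and for brevity the client and the server share one Selector on the loopback interface.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical demo: accept, connect, write and read events driven by one select loop.
class ThreePhaseDemo {

    static String sendOverLoopback(String message) {
        ByteBuffer outgoing = ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8));
        ByteBuffer incoming = ByteBuffer.allocate(outgoing.remaining());
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open();
             SocketChannel client = SocketChannel.open()) {

            // Phase 1: register events.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            client.configureBlocking(false);
            boolean connected = client.connect(server.getLocalAddress());
            // A loopback connect may finish immediately; then OP_CONNECT would never fire.
            client.register(selector, connected ? SelectionKey.OP_WRITE : SelectionKey.OP_CONNECT);

            long deadline = System.nanoTime() + 5_000_000_000L;   // give up after 5 s
            while (incoming.hasRemaining() && System.nanoTime() < deadline) {
                selector.select(500);                              // Phase 2: poll for ready events
                Iterator<SelectionKey> iter = selector.selectedKeys().iterator();
                while (iter.hasNext()) {
                    SelectionKey key = iter.next();
                    iter.remove();
                    // Phase 3: perform the actual IO, then adjust the registration.
                    if (key.isAcceptable()) {
                        SocketChannel accepted = server.accept();
                        if (accepted != null) {
                            accepted.configureBlocking(false);
                            accepted.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isConnectable()) {
                        if (client.finishConnect())
                            key.interestOps(SelectionKey.OP_WRITE);
                    } else if (key.isWritable()) {
                        client.write(outgoing);
                        if (!outgoing.hasRemaining())
                            key.interestOps(0);                    // done writing: cancel the write event
                    } else if (key.isReadable()) {
                        if (((SocketChannel) key.channel()).read(incoming) < 0)
                            key.cancel();                          // peer closed the connection
                    }
                }
            }
            for (SelectionKey key : selector.keys())
                key.channel().close();                             // also closes the accepted channel
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(incoming.array(), 0, incoming.position(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(sendOverLoopback("hello kafka"));
    }
}
```

Compared with the C skeleton, register() and interestOps() play the role of epoll_ctl, select() plays the role of epoll_wait, and accept/finishConnect/read/write are the actual IO calls.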

The following table shows, for each of the 4 events (connect, accept, read, write), the functions that correspond to these 3 phases:

Let's look at the core implementation of Selector in the Kafka client:

    @Override
    public void poll(long timeout) throws IOException {
        ...
        clear();  // reset the various state collections
        if (hasStagedReceives())
            timeout = 0;
        long startSelect = time.nanoseconds();
        int readyKeys = select(timeout);  // poll
        long endSelect = time.nanoseconds();
        currentTimeNanos = endSelect;
        this.sensors.selectTime.record(endSelect - startSelect, time.milliseconds());

        if (readyKeys > 0) {
            Set<SelectionKey> keys = this.nioSelector.selectedKeys();
            Iterator<SelectionKey> iter = keys.iterator();
            while (iter.hasNext()) {
                SelectionKey key = iter.next();
                iter.remove();
                KafkaChannel channel = channel(key);

                // register all per-connection metrics at once
                sensors.maybeRegisterConnectionMetrics(channel.id());
                lruConnections.put(channel.id(), currentTimeNanos);

                try {
                    if (key.isConnectable()) {  // a connect event is ready
                        channel.finishConnect();
                        this.connected.add(channel.id());
                        this.sensors.connectionCreated.record();
                    }

                    // only SSL needs this for its handshake; for a plain
                    // unencrypted channel, prepare() has an empty implementation
                    if (channel.isConnected() && !channel.ready())
                        channel.prepare();

                    if (channel.ready() && key.isReadable() && !hasStagedReceive(channel)) {  // read ready
                        NetworkReceive networkReceive;
                        while ((networkReceive = channel.read()) != null)  // the actual read
                            addToStagedReceives(channel, networkReceive);
                    }

                    if (channel.ready() && key.isWritable()) {  // write ready
                        Send send = channel.write();  // the actual write
                        if (send != null) {
                            this.completedSends.add(send);
                            this.sensors.recordBytesSent(channel.id(), send.size());
                        }
                    }

                    /* cancel any defunct sockets */
                    if (!key.isValid()) {
                        close(channel);
                        this.disconnected.add(channel.id());
                    }
                } catch (Exception e) {
                    String desc = channel.socketDescription();
                    if (e instanceof IOException)
                        log.debug("Connection with {} disconnected", desc, e);
                    else
                        log.warn("Unexpected error from {}; closing connection", desc, e);
                    close(channel);
                    this.disconnected.add(channel.id());
                }
            }
        }

        addToCompletedReceives();
        long endIo = time.nanoseconds();
        this.sensors.ioTime.record(endIo - endSelect, time.milliseconds());
        maybeCloseOldestConnection();
    }

Differences between epoll and Selector in registration: LT and ET modes

LT & ET

We know that epoll has 2 modes: LT (level-triggered), also called condition-triggered, and ET (edge-triggered), also called state-triggered. What is the difference between the 2?

To explain, we first introduce the concept of a socket's read and write buffers:

Level-triggered (condition-triggered): the read event keeps firing as long as the read buffer is not empty, and the write event keeps firing as long as the write buffer is not full. This matches ordinary programming habits and is epoll's default mode.

Edge-triggered (state-triggered): the read event fires once when the read buffer goes from empty to non-empty, and the write event fires once when the write buffer goes from full to not full. For example, when you send a large file, the write buffer fills up; once it can be written to again, there is a transition from full to not full.

From this analysis we can see:
In LT mode, avoid the "write dead loop" problem: the write buffer is rarely full, so the "writable" condition is almost always satisfied. If you register a write event and have no data to write, it will fire continuously. So in LT mode, once the data has been written, you must cancel the write event.

In ET mode, avoid the "short read" problem: say 100 bytes arrive and the event fires once, but you read only 50 bytes. The remaining 50 bytes are left unread and the event will not fire again for them, so the socket is effectively stuck. So in ET mode, be sure to drain the read buffer completely.

Another difference between LT and ET is that LT works with both blocking and non-blocking IO, whereas ET only works with non-blocking IO.

It is also often claimed that ET performs better but is harder to program and more error-prone. Whether ET is actually faster than LT is open to question and needs real test data to settle.
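The level-triggered behaviour is easy to observe from Java, whose Selector reports readiness in this way. The sketch below is an assumption-laden illustration (the class LevelTriggerDemo is hypothetical and a Pipe replaces a socket): it shows a read event that keeps firing while unread data remains, and a write event that keeps firing although there is nothing to send.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical demo of level-triggered readiness with a java.nio Selector.
class LevelTriggerDemo {

    // Polls `rounds` times without blocking and counts how often the key is reported ready.
    private static int countReady(Selector selector, int rounds) throws IOException {
        int hits = 0;
        for (int i = 0; i < rounds; i++) {
            hits += selector.selectNow();
            selector.selectedKeys().clear();
        }
        return hits;
    }

    // Read side: 100 bytes arrive, only 50 are read, then the selector is polled 3 more times.
    static int readReadyRoundsAfterPartialRead() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink(); Pipe.SourceChannel source = pipe.source()) {
                source.configureBlocking(false);
                source.register(selector, SelectionKey.OP_READ);
                sink.write(ByteBuffer.wrap(new byte[100]));
                selector.select(1000);
                selector.selectedKeys().clear();
                source.read(ByteBuffer.allocate(50));            // leave 50 bytes unread
                return countReady(selector, 3);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Write side: OP_WRITE is registered although there is no data to write.
    static int writeReadyRoundsWithNothingToSend() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink(); Pipe.SourceChannel source = pipe.source()) {
                sink.configureBlocking(false);
                sink.register(selector, SelectionKey.OP_WRITE);
                return countReady(selector, 3);                  // the "write dead loop" in miniature
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read ready rounds:  " + readReadyRoundsAfterPartialRead());
        System.out.println("write ready rounds: " + writeReadyRoundsWithNothingToSend());
    }
}
```

Both methods report the key as ready in every round, which is why a select loop that leaves OP_WRITE registered with nothing to send will spin.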

As noted above, epoll uses LT mode by default, and Java NIO uses epoll's LT mode. Below we analyze how connect/read/write events are handled in Java NIO.

Registration of the Connect event

    // Selector
    public void connect(String id, InetSocketAddress address, int sendBufferSize, int receiveBufferSize) throws IOException {
        if (this.channels.containsKey(id))
            throw new IllegalStateException("There is already a connection for id " + id);

        SocketChannel socketChannel = SocketChannel.open();
        ...
        try {
            socketChannel.connect(address);
        } catch (UnresolvedAddressException e) {
            socketChannel.close();
            throw new IOException("Can't resolve address: " + address, e);
        } catch (IOException e) {
            socketChannel.close();
            throw e;
        }
        // register the connect event when constructing the channel
        SelectionKey key = socketChannel.register(nioSelector, SelectionKey.OP_CONNECT);
        KafkaChannel channel = channelBuilder.buildChannel(id, key, maxReceiveSize);
        key.attach(channel);
        this.channels.put(id, channel);
    }

Cancellation of the Connect event

In the poll function above, when the connect event is ready, the connection has completed and is established:
    if (key.isConnectable()) {  // a connect event is ready
        channel.finishConnect();
        ...
    }

    // PlainTransportLayer
    public void finishConnect() throws IOException {
        socketChannel.finishConnect();  // call the channel's finishConnect()
        // cancel the connect event and register the read event
        key.interestOps(key.interestOps() & ~SelectionKey.OP_CONNECT | SelectionKey.OP_READ);
    }
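The interestOps expression is plain bit arithmetic on the SelectionKey constants. A small sketch (the class InterestOpsDemo and its helper names are hypothetical, introduced only for illustration) shows what it evaluates to:

```java
import java.nio.channels.SelectionKey;

// Hypothetical helper class illustrating interest-set bit arithmetic.
class InterestOpsDemo {

    // Same expression as in finishConnect(): clear OP_CONNECT, then set OP_READ.
    // '&' binds tighter than '|', so this is (ops & ~OP_CONNECT) | OP_READ.
    static int afterConnect(int ops) {
        return ops & ~SelectionKey.OP_CONNECT | SelectionKey.OP_READ;
    }

    // Renders an interest set as a readable list of event names.
    static String names(int ops) {
        StringBuilder sb = new StringBuilder();
        if ((ops & SelectionKey.OP_READ) != 0) sb.append("READ ");
        if ((ops & SelectionKey.OP_WRITE) != 0) sb.append("WRITE ");
        if ((ops & SelectionKey.OP_CONNECT) != 0) sb.append("CONNECT ");
        if ((ops & SelectionKey.OP_ACCEPT) != 0) sb.append("ACCEPT ");
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        int before = SelectionKey.OP_CONNECT;
        System.out.println(names(before) + " -> " + names(afterConnect(before)));
    }
}
```

Because each event is a separate bit, other interests that are already set (for example OP_WRITE) survive the transition untouched.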

Registration of the Read event

As the code above also shows, the read event is registered at the same time the connect event is cancelled.

Cancellation of the Read event

Because read must keep listening for new data from the remote side, it is never cancelled. And since this is LT mode, the event keeps firing as long as the read buffer contains data.

Registration of the Write event

    // Selector
    public void send(Send send) {
        KafkaChannel channel = channelOrFail(send.destination());
        try {
            channel.setSend(send);
        } catch (CancelledKeyException e) {
            this.failedSends.add(send.destination());
            close(channel);
        }
    }

    // KafkaChannel
    public void setSend(Send send) {
        if (this.send != null)
            throw new IllegalStateException("Attempt to begin a send operation with prior send operation still in progress.");
        this.send = send;
        this.transportLayer.addInterestOps(SelectionKey.OP_WRITE);  // every call to send registers a write event
    }

Cancellation of the Write event

    // In the poll function above
    if (channel.ready() && key.isWritable()) {  // write event ready
        Send send = channel.write();            // the write event is cancelled inside this write()
        if (send != null) {
            this.completedSends.add(send);
            this.sensors.recordBytesSent(channel.id(), send.size());
        }
    }


    // KafkaChannel (called from write())
    private boolean send(Send send) throws IOException {
        send.writeTo(transportLayer);
        if (send.completed())
            transportLayer.removeInterestOps(SelectionKey.OP_WRITE);  // cancel the write event
        return send.completed();
    }
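The same register-on-send, cancel-on-completion pattern can be reproduced with plain Java NIO. The following is a minimal sketch, not Kafka's implementation: WriteInterestDemo is a hypothetical stand-in for a channel wrapper, and a Pipe sink replaces the socket.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a channel wrapper; not Kafka's KafkaChannel.
class WriteInterestDemo {
    private final SelectionKey key;
    private final Pipe.SinkChannel channel;
    private ByteBuffer pending;

    WriteInterestDemo(SelectionKey key, Pipe.SinkChannel channel) {
        this.key = key;
        this.channel = channel;
    }

    void setSend(ByteBuffer data) {
        if (pending != null)
            throw new IllegalStateException("previous send still in progress");
        pending = data;
        key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);      // register the write event
    }

    boolean write() throws IOException {
        channel.write(pending);
        if (pending.hasRemaining())
            return false;                                                // partial write: stay registered
        pending = null;
        key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE);     // cancel the write event
        return true;
    }

    static String run(String message) {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink(); Pipe.SourceChannel source = pipe.source()) {
                sink.configureBlocking(false);
                SelectionKey key = sink.register(selector, 0);           // no interest yet
                WriteInterestDemo demo = new WriteInterestDemo(key, sink);
                int before = selector.selectNow();                       // nothing registered, nothing fires

                demo.setSend(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
                boolean sent = false;
                for (int round = 0; round < 10 && !sent; round++) {
                    selector.select(1000);
                    if (selector.selectedKeys().remove(key) && key.isWritable())
                        sent = demo.write();
                }
                int after = selector.selectNow();                        // the write event no longer fires
                return "before=" + before + " sent=" + sent + " after=" + after;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("hello"));
    }
}
```

Keeping OP_WRITE out of the interest set except while a send is pending is what prevents the select loop from spinning in LT mode.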

To sum up:
(1) "Event ready" means somewhat different things for different types of events.

Read event ready: the easiest to understand. The remote side has sent new data that needs to be read. Since this is LT mode, the event keeps firing as long as the read buffer contains data.

Write event ready: what does this mean? It actually means the local socket's write buffer is not full. As long as it is not full, the write event keeps firing. So, to avoid the "write dead loop" problem, cancel the write event once writing is finished.

Connect event ready: the connection has completed.

Accept event ready: a new connection has arrived; call accept to handle it.

(2) Different types of events are handled differently:

Connect event: registered once and cancelled after the connection succeeds. It happens exactly once.

Read event: never cancelled after registration; always listening.

Write event: registered once per call to send, and cancelled once the send succeeds.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
