Connection Establishment and Reconnection in ZeroMQ (Java)

Last updated: 2018-07-27 Source: Internet


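The previous article analysed Dealer, the simplest socket type in ZeroMQ. Analysis of the other concrete socket types can wait until they are actually needed.

For a messaging framework, what matters most is reliable communication, and the most important part of that is reconnecting after a connection drops.

Before looking at reconnection, let's see how ZeroMQ actively establishes a connection to a remote endpoint, starting with the connect method defined in SocketBase: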

    // Establish a connection to a remote address
    public boolean connect (String addr_) {
        if (ctx_terminated) {
            throw new ZError.CtxTerminatedException();
        }

        //  Process pending commands, if any.
        boolean brc = process_commands (0, false);
        if (!brc)
            return false;

        //  Parse addr_ string.
        URI uri;
        try {
            uri = new URI(addr_);   // build the URI object
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }

        String protocol = uri.getScheme();   // protocol type
        String address = uri.getAuthority();
        String path = uri.getPath();
        if (address == null)
            address = path;

        check_protocol (protocol);  // check that the protocol is supported

        if (protocol.equals("inproc")) {    // in-process communication
            //  TODO: inproc connect is specific with respect to creating pipes
            //  as there's no 'reconnect' functionality implemented. Once that
            //  is in place we should follow generic pipe creation algorithm.

            //  Find the peer endpoint.
            Ctx.Endpoint peer = find_endpoint (addr_);
            if (peer.socket == null)
                return false;

            // The total HWM for an inproc connection should be the sum of
            // the binder's HWM and the connector's HWM.
            int sndhwm = 0;
            if (options.sndhwm != 0 && peer.options.rcvhwm != 0)
                sndhwm = options.sndhwm + peer.options.rcvhwm;
            int rcvhwm = 0;
            if (options.rcvhwm != 0 && peer.options.sndhwm != 0)
                rcvhwm = options.rcvhwm + peer.options.sndhwm;

            //  Create a bi-directional pipe to connect the peers.
            ZObject[] parents = {this, peer.socket};
            Pipe[] pipes = {null, null};
            int[] hwms = {sndhwm, rcvhwm};
            boolean[] delays = {options.delay_on_disconnect, options.delay_on_close};
            Pipe.pipepair (parents, pipes, hwms, delays);

            //  Attach local end of the pipe to this socket object.
            attach_pipe (pipes [0]);

            //  If required, send the identity of the peer to the local socket.
            if (peer.options.recv_identity) {
                Msg id = new Msg (options.identity_size);
                id.put (options.identity, 0 , options.identity_size);
                id.set_flags (Msg.identity);
                boolean written = pipes [0].write (id);
                assert (written);
                pipes [0].flush ();
            }

            //  If required, send the identity of the local socket to the peer.
            if (options.recv_identity) {
                Msg id = new Msg (peer.options.identity_size);
                id.put (peer.options.identity, 0 , peer.options.identity_size);
                id.set_flags (Msg.identity);
                boolean written = pipes [1].write (id);
                assert (written);
                pipes [1].flush ();
            }

            //  Attach remote end of the pipe to the peer socket. Note that peer's
            //  seqnum was incremented in find_endpoint function. We don't need it
            //  increased here.
            send_bind (peer.socket, pipes [1], false);

            // Save last endpoint URI
            options.last_endpoint = addr_;

            // remember inproc connections for disconnect
            inprocs.put(addr_, pipes[0]);

            return true;
        }

        // Choose an IO thread to host the session that is about to be created
        IOThread io_thread = choose_io_thread (options.affinity);
        if (io_thread == null) {
            throw new IllegalStateException("Empty IO Thread");
        }

        // Create the address object
        Address paddr = new Address (protocol, address);
        if (protocol.equals("tcp")) {  // TCP
            paddr.resolved( new TcpAddress () );
            paddr.resolved().resolve (address, options.ipv4only != 0);
        } else if (protocol.equals("ipc")) {  // inter-process communication
            paddr.resolved( new IpcAddress () );
            paddr.resolved().resolve (address, true);
        }

        //  Create session.
        // The first argument is the IO thread the session will be attached to;
        // the second (true) means the session must actively establish the connection
        SessionBase session = SessionBase.create (io_thread, true, this,
            options, paddr);
        assert (session != null);

        //  PGM does not support subscription forwarding; ask for all data to be
        //  sent to this pipe.
        boolean icanhasall = false;
        if (protocol.equals("pgm") || protocol.equals("epgm"))
            icanhasall = true;

        // Create the pipe pair that links the session to this socket
        if (options.delay_attach_on_connect != 1 || icanhasall) {
            //  Create a bi-directional pipe.
            ZObject[] parents = {this, session};
            Pipe[] pipes = {null, null};
            int[] hwms = {options.sndhwm, options.rcvhwm};
            boolean[] delays = {options.delay_on_disconnect, options.delay_on_close};
            Pipe.pipepair (parents, pipes, hwms, delays);

            //  Attach local end of the pipe to the socket object.
            attach_pipe (pipes [0], icanhasall);

            //  Attach remote end of the pipe to the session object later on.
            // With the other end attached to the session, the session and the
            // socket can communicate through the pipe
            session.attach_pipe (pipes [1]);
        }

        // Save last endpoint URI
        options.last_endpoint = paddr.toString ();

        add_endpoint (addr_, session);  // associate the session with this address
        return true;
    }
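The address parsing at the top of connect can be tried in isolation with plain java.net.URI. The following is only a small sketch of that step: the class name EndpointParts and the sample endpoints are made up for illustration and are not part of the ZeroMQ code.

```java
import java.net.URI;
import java.net.URISyntaxException;

class EndpointParts {
    /** Splits an endpoint into {protocol, address}, mirroring the parsing step in connect. */
    static String[] split(String endpoint) {
        URI uri;
        try {
            uri = new URI(endpoint);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
        String protocol = uri.getScheme();
        String address = uri.getAuthority();
        // For endpoints such as ipc:///tmp/demo.sock there is no authority,
        // so the path is used as the address instead.
        if (address == null) {
            address = uri.getPath();
        }
        return new String[] {protocol, address};
    }

    public static void main(String[] args) {
        String[] samples = {"tcp://127.0.0.1:5555", "ipc:///tmp/demo.sock", "inproc://workers"};
        for (String endpoint : samples) {
            String[] parts = split(endpoint);
            System.out.println(parts[0] + " -> " + parts[1]);
        }
    }
}
```

The fallback to the path is what lets the same code handle both host:port style addresses and file-system style ipc addresses.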

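The focus here is on the TCP path, since TCP is what gets used in a distributed environment. From the earlier articles we know that a socket may sit on top of several connections, that each connection corresponds to a StreamEngine object, and that each StreamEngine is associated with a Session object that handles the interaction with the socket above it. The main job of connect is therefore to create the Session object and the Pipe objects that link it to the socket. It then calls add_endpoint to launch the session: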

    // Tracks addresses and sessions: records every address a connection has
    // been set up for, together with the corresponding session
    private void add_endpoint (String addr_, Own endpoint_) {
        //  Activate the session. Make it a child of this socket.
        launch_child (endpoint_);   // launch the endpoint, which mainly means plugging it into its IO thread
        endpoints.put (addr_, endpoint_);
    }

What add_endpoint launches is the session object. Once launched, the session runs its process_plug method:

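    // Handle the plug command; start connecting if this session must connect
    protected void process_plug () {
        io_object.set_handler(this);  // install the handler that responds to IO events
        if (connect) {  // this side has to actively connect to the remote endpoint
            start_connecting (false);   // start connecting; false means do not wait
        }
    }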

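process_plug first installs the event callback for the session's IO object. The connect field is set when the session is created: it is true for a connection that this side initiates and false for one accepted by a listener. When it is true, start_connecting is called: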

    // Called to establish the connection when this session is a connecting one
    private void start_connecting (boolean wait_) {
        assert (connect);

        //  Choose I/O thread to run connecter in. Given that we are already
        //  running in an I/O thread, there must be at least one available.
        IOThread io_thread = choose_io_thread (options.affinity);  // IO thread that will host the TcpConnecter
        assert (io_thread != null);

        //  Create the connecter object.
        if (addr.protocol().equals("tcp")) {
            TcpConnecter connecter = new TcpConnecter (
                io_thread, this, options, addr, wait_);
            //alloc_assert (connecter);
            launch_child (connecter);  // launch the TcpConnecter
            return;
        }

        if (addr.protocol().equals("ipc")) {
            IpcConnecter connecter = new IpcConnecter (
                io_thread, this, options, addr, wait_);
            //alloc_assert (connecter);
            launch_child (connecter);
            return;
        }

        assert (false);
    }

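start_connecting takes one parameter, which is passed on to the TcpConnecter constructor and indicates whether the connection attempt should be delayed. For the initial connection it is false, meaning no delay. As the reconnection code later shows, reconnecting uses a delayed attempt.

The actual connection work is delegated to the TcpConnecter object, which is essentially a helper class.

The full code for how it connects is not listed here. Roughly, the process is: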

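(1) Create a SocketChannel, set it to non-blocking mode, and call connect to start connecting to the remote address.

(2) Register the SocketChannel with the IO thread's poller and ask for the connect event.

(3) In the callback for the connect event, unregister the SocketChannel from the poller, create a new StreamEngine that wraps the SocketChannel, and attach that StreamEngine to the session created earlier.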
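The three steps above can be sketched with plain java.nio, without any ZeroMQ classes. This is only an illustration under stated assumptions: the class NonBlockingConnectSketch is hypothetical, a java.nio Selector stands in for ZeroMQ's poller, and a throwaway listener on the loopback interface plays the remote peer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

class NonBlockingConnectSketch {
    /** Connects to a throwaway loopback listener; returns true once connected. */
    static boolean connectToLoopback() {
        try (ServerSocketChannel server = ServerSocketChannel.open();
             Selector selector = Selector.open();
             SocketChannel channel = SocketChannel.open()) {
            // Stand-in for the remote peer: a listener on an ephemeral loopback port.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));

            // Step 1: non-blocking channel, then start the connect.
            channel.configureBlocking(false);
            boolean done = channel.connect(server.getLocalAddress());

            if (!done) {
                // Step 2: register with the selector and ask for the connect event.
                SelectionKey key = channel.register(selector, SelectionKey.OP_CONNECT);
                long deadline = System.nanoTime() + 5_000_000_000L;  // give up after 5 s
                while (!done && System.nanoTime() < deadline) {
                    selector.select(500);
                    if (key.isConnectable()) {
                        done = channel.finishConnect();
                    }
                    selector.selectedKeys().clear();
                }
                // Step 3: the connect event has been handled, so stop polling for it.
                // The real code would now wrap the channel in a StreamEngine.
                key.cancel();
            }
            return done && channel.isConnected();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("connected: " + connectToLoopback());
    }
}
```

If the connection attempt fails, finishConnect throws an IOException, which corresponds to the error branch of the connect callback.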

Let's look at what the connect event callback in TcpConnecter does:

    // Callback for the connect event; the attempt may also have failed or timed out
    public void connect_event () {
        boolean err = false;
        SocketChannel fd = null;
        try {
            fd = connect ();   // get the channel whose connection has been established
        } catch (ConnectException e) {
            err = true;
        } catch (SocketException e) {
            err = true;
        } catch (SocketTimeoutException e) {
            err = true;
        } catch (IOException e) {
            throw new ZError.IOException(e);
        }

        // The handle can now be removed from the poller; this connecter has done its job
        io_object.rm_fd (handle);
        handle_valid = false;

        if (err) {
            //  Handle the error condition by attempt to reconnect.
            close ();
            add_reconnect_timer();  // try to establish the connection again later
            return;
        }

        handle = null;

        try {
            Utils.tune_tcp_socket (fd);
            Utils.tune_tcp_keepalives (fd, options.tcp_keepalive, options.tcp_keepalive_cnt,
                options.tcp_keepalive_idle, options.tcp_keepalive_intvl);
        } catch (SocketException e) {
            throw new RuntimeException(e);
        }

        //  Create the engine object for this connection.
        // The StreamEngine wraps the connected channel
        StreamEngine engine = null;
        try {
            engine = new StreamEngine (fd, options, endpoint);
        } catch (ZError.InstantiationException e) {
            socket.event_connect_delayed (endpoint, -1);
            return;
        }

        //  Attach the engine to the corresponding session object.
        // This also plugs the StreamEngine into the IO thread, i.e. registers it with the poller
        send_attach (session, engine);

        //  Shut the connecter down.
        terminate ();

        socket.event_connected (endpoint, fd);  // notify the socket above that the connection is up
    }

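The code is fairly self-explanatory. Note that a failed or timed-out connection attempt also leads to a reconnect attempt, through add_reconnect_timer.

That covers how a connection is established. Next comes reconnection after a connection is lost, starting with what happens when the connection drops.

First we need a way to tell that the underlying channel has been disconnected. When the peer has closed the connection, read on the channel returns -1. So the place to start is StreamEngine's in_event method, to see what it does when read returns -1: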

    // Callback invoked when the underlying channel has data to read
    public void in_event ()  {
        if (handshaking)
            if (!handshake ())
                return;

        assert (decoder != null);
        boolean disconnection = false;

        //  If there's no data to process in the buffer...
        if (insize == 0) {
            //  Retrieve the buffer and read as much data as possible.
            //  Note that buffer can be arbitrarily large. However, we assume
            //  the underlying TCP layer has fixed buffer size and thus the
            //  number of bytes read will be always limited.
            inbuf = decoder.get_buffer ();  // buffer from the decoder that the data is read into
            insize = read (inbuf);  // read the incoming data into the buffer and record the byte count
            inbuf.flip();  // prepare the buffer for reading

            //  Check whether the peer has closed the connection.
            if (insize == -1) {  // -1 means the underlying socket connection is gone
                insize = 0;
                disconnection = true;  // set the flag
            }
        }

        //  Push the data to the decoder.
        int processed = decoder.process_buffer (inbuf, insize);  // decode the data that was read
        if (processed == -1) {
            disconnection = true;
        } else {
            //  Stop polling for input if we got stuck.
            if (processed < insize)  // the decoder consumed less than was read: stop polling for input
                io_object.reset_pollin (handle);

            //  Adjust the buffer.
            insize -= processed;  // bytes still waiting to be processed
        }

        //  Flush all messages the decoder may have produced.
        session.flush ();  // hand the decoded messages to the session

        //  An input error has occurred. If the last decoded message
        //  has already been accepted, we terminate the engine immediately.
        //  Otherwise, we stop waiting for socket events and postpone
        //  the termination until after the message is accepted.
        if (disconnection) {   // the connection has been lost and must be handled
            if (decoder.stalled ()) {
                io_object.rm_fd (handle);
                io_enabled = false;
            } else {
                error ();
            }
        }
    }
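As a standalone illustration of the -1 signal that in_event checks for, the sketch below connects to a throwaway loopback listener, has the listener close its end immediately, and then reads on the client side. The class name PeerCloseSketch is hypothetical, and blocking channels are used purely to keep the example short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

class PeerCloseSketch {
    /** Returns what read() reports after the peer has closed its end of the connection. */
    static int readAfterPeerClose() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                // The "peer" accepts the connection and closes it straight away.
                server.accept().close();
                // No data was ever sent, so this blocking read returns only
                // when the end of the stream is detected.
                return client.read(ByteBuffer.allocate(16));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read returned " + readAfterPeerClose());
    }
}
```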

In in_event, a read result of -1 sets the disconnection flag. Then, unless the decoder is stalled, the error method is called:

    // Report the error so that the ZMQ socket above drops this connection
    private void error ()  {
        assert (session != null);
        socket.event_disconnected (endpoint, handle);  // notify the socket above
        session.detach ();   // the session cleans up its pipe to the socket and then tries to reconnect
        unplug ();  // unregister from the poller
        destroy ();  // close the underlying channel and dispose of this engine
    }

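When the underlying link is broken, the channel is no longer valid, and neither is the current StreamEngine. The engine therefore has to be destroyed and unregistered from the poller, and the socket above has to be told that the connection to this endpoint is gone. The session also has to be notified so that it can do its own handling, which includes reconnecting. Here is what the session's detach method does: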

    // Drop the association with the underlying engine
    public void detach()  {
        //  Engine is dead. Let's forget about it.
        engine = null;  // release the reference to the engine

        //  Remove any half-done messages from the pipes.
        clean_pipes ();  // discard messages that were only partially received

        //  Send the event to the derived class.
        detached ();   // tear down the pipe if needed, then reconnect

        //  Just in case there's only a delimiter in the pipe.
        if (pipe != null)
            pipe.check_read ();
    }

The reconnection code is not visible in detach itself. It is in the detached method:

    private void detached() {
        //  Transient session self-destructs after peer disconnects.
        if (!connect) {  // not an actively established connection: just terminate, no reconnect
            terminate ();
            return;
        }

        //  For delayed connect situations, terminate the pipe
        //  and reestablish later on
        if (pipe != null && options.delay_attach_on_connect == 1
            && !addr.protocol ().equals ("pgm") && !addr.protocol ().equals ("epgm")) {
            pipe.hiccup ();
            pipe.terminate (false);
            terminating_pipes.add (pipe);
            pipe = null;
        }

        reset ();  // reset the state flags

        // Actively try to reconnect
        if (options.reconnect_ivl != -1) {
            start_connecting (true);   // true: the reconnect attempt is delayed
        }

        //  For subscriber sockets we hiccup the inbound pipe, which will cause
        //  the socket object to resend all the subscriptions.
        if (pipe != null && (options.type == ZMQ.ZMQ_SUB || options.type == ZMQ.ZMQ_XSUB))
            pipe.hiccup ();
    }
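The branching in detached can be condensed into a small decision function. The sketch below is only a restatement of the control flow shown above: the class DetachDecisionSketch, the Action enum and the parameter names are made up for illustration, and the pipe handling is left out.

```java
class DetachDecisionSketch {
    enum Action { TERMINATE_SESSION, RECONNECT_AFTER_DELAY, STAY_DETACHED }

    /**
     * @param activelyConnected true if this side initiated the connection (the connect field)
     * @param reconnectIvl      the reconnect interval option; -1 disables reconnection
     */
    static Action onDetached(boolean activelyConnected, int reconnectIvl) {
        if (!activelyConnected) {
            // Sessions created for accepted connections simply go away.
            return Action.TERMINATE_SESSION;
        }
        if (reconnectIvl != -1) {
            // Corresponds to start_connecting(true).
            return Action.RECONNECT_AFTER_DELAY;
        }
        return Action.STAY_DETACHED;
    }

    public static void main(String[] args) {
        System.out.println(onDetached(false, 100));
        System.out.println(onDetached(true, 100));
        System.out.println(onDetached(true, -1));
    }
}
```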

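detached calls start_connecting with true. The flow is much the same as for the initial connection, except that the connection attempt is delayed.

That is, a timer is set on the IO thread, and the connection is attempted only when the timer fires. This keeps reconnection attempts at a bounded rate.

The timer mechanism itself is straightforward and is not covered here.

From the code above: after a connection is lost, ZeroMQ automatically tries to reconnect, provided the connection was initiated by this side rather than accepted by a listener and options.reconnect_ivl is not -1.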
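For readers who want a feel for the delayed retry, here is a minimal sketch built on a ScheduledExecutorService instead of ZeroMQ's IO-thread timers. Everything in it is an assumption made for illustration: the class DelayedReconnectSketch, the fixed interval, and the BooleanSupplier that stands in for a real connection attempt. It is not how TcpConnecter is actually written.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

class DelayedReconnectSketch {
    /** Retries connectAttempt every intervalMs until it succeeds; returns the attempts used. */
    static int reconnect(BooleanSupplier connectAttempt, long intervalMs) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger attempts = new AtomicInteger();
        CountDownLatch connected = new CountDownLatch(1);

        Runnable attempt = new Runnable() {
            @Override
            public void run() {
                attempts.incrementAndGet();
                if (connectAttempt.getAsBoolean()) {
                    connected.countDown();
                } else {
                    // Failed: arm the timer again instead of retrying immediately.
                    timer.schedule(this, intervalMs, TimeUnit.MILLISECONDS);
                }
            }
        };

        // Even the first reconnect attempt waits for the interval (the wait_ == true case).
        timer.schedule(attempt, intervalMs, TimeUnit.MILLISECONDS);
        try {
            connected.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            timer.shutdownNow();
        }
        return attempts.get();
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Simulated peer that becomes reachable on the third attempt.
        int used = reconnect(() -> calls.incrementAndGet() >= 3, 50);
        System.out.println("connected after " + used + " attempts");
    }
}
```

Spacing the attempts out with a timer, rather than looping, is what keeps an unreachable peer from consuming the thread that also serves the healthy connections.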
