Swarmkit notes (8)--agent.session

Source: Internet
Author: User

Communication between the Agent and the manager is carried out through a session. The agent.session structure is defined as follows:

    // session encapsulates one round of registration with the manager. session
    // starts the registration and heartbeat control cycle. Any failure will result
    // in a complete shutdown of the session and it must be reestablished.
    //
    // All communication with the master is done through session.  Changes that
    // flow into the agent, such as task assignment, are called back into the
    // agent through errs, messages and tasks.
    type session struct {
        agent     *Agent
        sessionID string
        session   api.Dispatcher_SessionClient
        errs      chan error
        messages  chan *api.SessionMessage
        tasks     chan *api.TasksMessage

        registered chan struct{} // closed registration
        closed     chan struct{}
    }

(1) The registered channel is used to signal that the agent has registered successfully with the manager:

    func (s *session) run(ctx context.Context, delay time.Duration) {
        time.Sleep(delay) // delay before registering.

        if err := s.start(ctx); err != nil {
            select {
            case s.errs <- err:
            case <-s.closed:
            case <-ctx.Done():
            }
            return
        }

        ctx = log.WithLogger(ctx, log.G(ctx).WithField("session.id", s.sessionID))

        go runctx(ctx, s.closed, s.errs, s.heartbeat)
        go runctx(ctx, s.closed, s.errs, s.watch)
        go runctx(ctx, s.closed, s.errs, s.listen)

        close(s.registered)
    }

In the session.run() function, if session.start() returns without an error, the last step is to close the registered channel. Meanwhile, in Agent.run():

    func (a *Agent) run(ctx context.Context) {
        .....
        session    = newSession(ctx, a, backoff) // start the initial session
        registered = session.registered
        for {
            select {
            ......
            case <-registered:
                log.G(ctx).Debugln("agent: registered")
                if ready != nil {
                    close(ready)
                }
                ready = nil
                registered = nil // we only care about this once per session
                backoff = 0      // reset backoff
                sessionq = a.sessionq
            ......
            }
        }
    }

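Once registered is closed, the case <-registered branch becomes ready and is executed. Because the branch then sets the local registered variable to nil, it fires only once per session.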

(2) When an error occurs while the session is running, the error is sent to the errs channel. In Agent.run():

    case err := <-session.errs:
        // TODO(stevvooe): This may actually block if a session is closed
        // but no error was sent. Session.close must only be called here
        // for this to work.
        if err != nil {
            log.G(ctx).WithError(err).Error("agent: session failed")
            backoff = initialSessionFailureBackoff + 2*backoff
            if backoff > maxSessionFailureBackoff {
                backoff = maxSessionFailureBackoff
            }
        }

        if err := session.close(); err != nil {
            log.G(ctx).WithError(err).Error("agent: closing session failed")
        }
        sessionq = nil
        // if we're here before <-registered, do nothing for that event
        registered = nil

        // Bounce the connection.
        if a.config.Picker != nil {
            a.config.Picker.Reset()
        }

Once an error is received, the backoff is increased, the session is closed, and some cleanup work is done.

(3) The messages channel receives the messages that the manager sends to the agent and passes them to the Agent.run() function for processing:

    case msg := <-session.messages:
        if err := a.handleSessionMessage(ctx, msg); err != nil {
            log.G(ctx).WithError(err).Error("session message handler failed")
        }

(4) The tasks channel receives from the manager the tasks that need to run on this node. These are likewise passed to the Agent.run() function for processing:

    case msg := <-session.tasks:
        if err := a.worker.Assign(ctx, msg.Tasks); err != nil {
            log.G(ctx).WithError(err).Error("task assignment failed")
        }

(5) The closed channel is closed inside the session.close() function, which is called in the case err := <-session.errs: branch. Once the closed channel is closed, a new session is created and the connection is re-established:

    case <-session.closed:
        log.G(ctx).Debugf("agent: rebuild session")

        // select a session registration delay from backoff range.
        delay := time.Duration(rand.Int63n(int64(backoff)))
        session = newSession(ctx, a, delay)
        registered = session.registered
        sessionq = a.sessionq

Now look at the session.start() function:

    // start begins the session and returns the first SessionMessage.
    func (s *session) start(ctx context.Context) error {
        log.G(ctx).Debugf("(*session).start")

        client := api.NewDispatcherClient(s.agent.config.Conn)

        description, err := s.agent.config.Executor.Describe(ctx)
        if err != nil {
            log.G(ctx).WithError(err).WithField("executor", s.agent.config.Executor).
                Errorf("node description unavailable")
            return err
        }
        // Override hostname
        if s.agent.config.Hostname != "" {
            description.Hostname = s.agent.config.Hostname
        }

        errChan := make(chan error, 1)
        var (
            msg    *api.SessionMessage
            stream api.Dispatcher_SessionClient
        )
        // Note: we don't defer cancellation of this context, because the
        // streaming RPC is used after this function returned. We only cancel
        // it in the timeout case to make sure the goroutine completes.
        sessionCtx, cancelSession := context.WithCancel(ctx)

        // Need to run Session in a goroutine since there's no way to set a
        // timeout for an individual Recv call in a stream.
        go func() {
            stream, err = client.Session(sessionCtx, &api.SessionRequest{
                Description: description,
            })
            if err != nil {
                errChan <- err
                return
            }

            msg, err = stream.Recv()
            errChan <- err
        }()

        select {
        case err := <-errChan:
            if err != nil {
                return err
            }
        case <-time.After(dispatcherRPCTimeout):
            cancelSession()
            return errors.New("session initiation timed out")
        }

        s.sessionID = msg.SessionID
        s.session = stream

        return s.handleSessionMessage(ctx, msg)
    }

(1)

    client := api.NewDispatcherClient(s.agent.config.Conn)

    description, err := s.agent.config.Executor.Describe(ctx)
    if err != nil {
        log.G(ctx).WithError(err).WithField("executor", s.agent.config.Executor).
            Errorf("node description unavailable")
        return err
    }
    // Override hostname
    if s.agent.config.Hostname != "" {
        description.Hostname = s.agent.config.Hostname
    }

The api.NewDispatcherClient() function and the type it returns are defined as follows:

    type dispatcherClient struct {
        cc *grpc.ClientConn
    }

    func NewDispatcherClient(cc *grpc.ClientConn) DispatcherClient {
        return &dispatcherClient{cc}
    }

s.agent.config.Conn is the gRPC connection to the manager. It was obtained earlier, in the Node.runAgent() function, with the following code:

    conn, err := grpc.Dial(manager.Addr,
        grpc.WithPicker(picker),
        grpc.WithTransportCredentials(creds),
        grpc.WithBackoffMaxDelay(maxSessionFailureBackoff))

s.agent.config.Executor.Describe() returns a description of the current node (type: *api.NodeDescription).

(2)

    errChan := make(chan error, 1)
    var (
        msg    *api.SessionMessage
        stream api.Dispatcher_SessionClient
    )
    // Note: we don't defer cancellation of this context, because the
    // streaming RPC is used after this function returned. We only cancel
    // it in the timeout case to make sure the goroutine completes.
    sessionCtx, cancelSession := context.WithCancel(ctx)

    // Need to run Session in a goroutine since there's no way to set a
    // timeout for an individual Recv call in a stream.
    go func() {
        stream, err = client.Session(sessionCtx, &api.SessionRequest{
            Description: description,
        })
        if err != nil {
            errChan <- err
            return
        }

        msg, err = stream.Recv()
        errChan <- err
    }()

The dispatcherClient.Session() code is as follows:

    func (c *dispatcherClient) Session(ctx context.Context, in *SessionRequest, opts ...grpc.CallOption) (Dispatcher_SessionClient, error) {
        stream, err := grpc.NewClientStream(ctx, &_Dispatcher_serviceDesc.Streams[0], c.cc, "/docker.swarmkit.v1.Dispatcher/Session", opts...)
        if err != nil {
            return nil, err
        }
        x := &dispatcherSessionClient{stream}
        if err := x.ClientStream.SendMsg(in); err != nil {
            return nil, err
        }
        if err := x.ClientStream.CloseSend(); err != nil {
            return nil, err
        }
        return x, nil
    }

It returns a value that satisfies the Dispatcher_SessionClient interface:

    type Dispatcher_SessionClient interface {
        Recv() (*SessionMessage, error)
        grpc.ClientStream
    }

The grpc.NewClientStream() function returns a grpc.ClientStream interface, and dispatcherSessionClient is defined as follows:

    type dispatcherSessionClient struct {
        grpc.ClientStream
    }

To satisfy the Dispatcher_SessionClient interface definition, the dispatcherSessionClient struct also implements the Recv method:

    func (x *dispatcherSessionClient) Recv() (*SessionMessage, error) {
        m := new(SessionMessage)
        if err := x.ClientStream.RecvMsg(m); err != nil {
            return nil, err
        }
        return m, nil
    }

x.ClientStream.SendMsg() sends the SessionRequest, which contains only a NodeDescription:

    // SessionRequest starts a session.
    type SessionRequest struct {
        Description *NodeDescription `protobuf:"bytes,1,opt,name=description" json:"description,omitempty"`
    }

x.ClientStream.CloseSend() indicates that all send operations have completed.

The goroutine then waits for the first message from the manager and sends the resulting err to errChan:

    msg, err = stream.Recv()
    errChan <- err

(3)

    select {
    case err := <-errChan:
        if err != nil {
            return err
        }
    case <-time.After(dispatcherRPCTimeout):
        cancelSession()
        return errors.New("session initiation timed out")
    }

    s.sessionID = msg.SessionID
    s.session = stream

    return s.handleSessionMessage(ctx, msg)

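The calling goroutine initially blocks in select. Once a valid response is received, the session initialization is complete, and the agent goes on to wait for the manager to assign tasks. If no response arrives within dispatcherRPCTimeout, the session context is cancelled and start() returns an error.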

Once session.start() succeeds, three more goroutines are launched:

    go runctx(ctx, s.closed, s.errs, s.heartbeat)
    go runctx(ctx, s.closed, s.errs, s.watch)
    go runctx(ctx, s.closed, s.errs, s.listen)

session.heartbeat() creates a new dispatcherClient variable and sends an api.HeartbeatRequest to the manager after 1 second. The manager returns an api.HeartbeatResponse that tells the agent how often to send heartbeats; the default period is currently 5 seconds.

session.watch() creates a new dispatcherTasksClient variable and sends an api.TasksRequest to tell the manager that the agent is ready. It then blocks in the Recv() function, waiting for the manager to send tasks.

session.listen() reuses the session.session variable. It blocks in the Recv() function, waits for the manager to send a SessionMessage, and then processes it.
