NSQ Source Reading (iii): TCP Handler


TCP Handler

The TCP handler handles each TCP connection:

type tcpServer struct {
    ctx *context
}

func (p *tcpServer) Handle(clientConn net.Conn) {
    p.ctx.nsqd.logf("TCP: new client(%s)", clientConn.RemoteAddr())

    // The client should initialize itself by sending a 4 byte sequence indicating
    // the version of the protocol that it intends to communicate, this will allow us
    // to gracefully upgrade the protocol away from text/line oriented to whatever...
    // ztd: each time a client establishes a connection, the first thing it sends is
    // the protocol version; judging from the code, only V2 is currently supported
    buf := make([]byte, 4)
    _, err := io.ReadFull(clientConn, buf)
    if err != nil {
        p.ctx.nsqd.logf("ERROR: failed to read protocol version - %s", err)
        return
    }
    protocolMagic := string(buf)

    p.ctx.nsqd.logf("CLIENT(%s): desired protocol magic '%s'",
        clientConn.RemoteAddr(), protocolMagic)

    var prot protocol.Protocol
    switch protocolMagic {
    case "  V2":
        prot = &protocolV2{ctx: p.ctx}
    default:
        protocol.SendFramedResponse(clientConn, frameTypeError, []byte("E_BAD_PROTOCOL"))
        clientConn.Close()
        p.ctx.nsqd.logf("ERROR: client(%s) bad protocol magic '%s'",
            clientConn.RemoteAddr(), protocolMagic)
        return
    }

    // ztd: hand the connection over to IOLoop to handle the client
    err = prot.IOLoop(clientConn)
    if err != nil {
        p.ctx.nsqd.logf("ERROR: client(%s) - %s", clientConn.RemoteAddr(), err)
        return
    }
}
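SendFramedResponse above wraps its payload in NSQ's standard frame: a 4-byte big-endian size, a 4-byte big-endian frame type, then the data. The following is a minimal sketch of that framing based on the documented wire format, not the exact nsqd implementation:

package main

import (
    "bytes"
    "encoding/binary"
    "fmt"
    "io"
)

// writeFramed writes [size][frameType][data], where size covers the
// frame type plus the data, matching NSQ's documented wire format.
func writeFramed(w io.Writer, frameType int32, data []byte) error {
    size := int32(4 + len(data))
    if err := binary.Write(w, binary.BigEndian, size); err != nil {
        return err
    }
    if err := binary.Write(w, binary.BigEndian, frameType); err != nil {
        return err
    }
    _, err := w.Write(data)
    return err
}

func main() {
    var buf bytes.Buffer
    // frame type 1 is the error frame in the protocol docs
    _ = writeFramed(&buf, 1, []byte("E_BAD_PROTOCOL"))
    fmt.Printf("% x\n", buf.Bytes())
}

Frame type 0 is a response, 1 an error, and 2 a message; this is how a client later distinguishes heartbeats and errors from delivered messages.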

IOLoop handles the TCP connection:

func (p *protocolV2) IOLoop(conn net.Conn) error {
    var err error
    var line []byte
    var zeroTime time.Time

    clientID := atomic.AddInt64(&p.ctx.nsqd.clientIDSequence, 1)
    client := newClientV2(clientID, conn, p.ctx)

    // synchronize the startup of messagePump in order
    // to guarantee that it gets a chance to initialize
    // goroutine local state derived from client attributes
    // and avoid a potential race with IDENTIFY (where a client
    // could have changed or disabled said attributes)
    messagePumpStartedChan := make(chan bool)
    go p.messagePump(client, messagePumpStartedChan)
    <-messagePumpStartedChan

    for {
        if client.HeartbeatInterval > 0 {
            client.SetReadDeadline(time.Now().Add(client.HeartbeatInterval * 2))
        } else {
            client.SetReadDeadline(zeroTime)
        }

        // ReadSlice does not allocate new space for the data each request
        // ie. the returned slice is only valid until the next call to it
        line, err = client.Reader.ReadSlice('\n')
        if err != nil {
            if err == io.EOF {
                err = nil
            } else {
                err = fmt.Errorf("failed to read command - %s", err)
            }
            break
        }

        // trim the '\n'
        line = line[:len(line)-1]
        // optionally trim the '\r'
        if len(line) > 0 && line[len(line)-1] == '\r' {
            line = line[:len(line)-1]
        }
        // ztd: the command separates its params with a space
        params := bytes.Split(line, separatorBytes)

        if p.ctx.nsqd.getOpts().Verbose {
            p.ctx.nsqd.logf("PROTOCOL(V2): [%s] %s", client, params)
        }

        var response []byte
        // ztd: execute the command; different commands perform different functions,
        // discussed below alongside a typical client
        response, err = p.Exec(client, params)
        if err != nil {
            ctx := ""
            if parentErr := err.(protocol.ChildErr).Parent(); parentErr != nil {
                ctx = " - " + parentErr.Error()
            }
            p.ctx.nsqd.logf("ERROR: [%s] - %s%s", client, err, ctx)

            sendErr := p.Send(client, frameTypeError, []byte(err.Error()))
            if sendErr != nil {
                p.ctx.nsqd.logf("ERROR: [%s] - %s%s", client, sendErr, ctx)
                break
            }

            // errors of type FatalClientErr should forceably close the connection
            if _, ok := err.(*protocol.FatalClientErr); ok {
                break
            }
            continue
        }

        if response != nil {
            err = p.Send(client, frameTypeResponse, response)
            if err != nil {
                err = fmt.Errorf("failed to send response - %s", err)
                break
            }
        }
    }

    // ztd: receiving EOF means the client closed the connection
    p.ctx.nsqd.logf("PROTOCOL(V2): [%s] exiting ioloop", client)
    conn.Close()
    close(client.ExitChan)
    if client.Channel != nil {
        client.Channel.RemoveClient(client.ID)
    }

    return err
}

Referring to the consumer example on the official website, I wrote a simple client whose function is to subscribe to a topic and channel and print each message to the screen when a producer sends one to the channel. The goal is to better understand nsqd by walking through this interaction. It looks like this:

package main

import (
    "fmt"

    nsq "github.com/nsqio/go-nsq"
)

func main() {
    config := nsq.NewConfig()
    c, err := nsq.NewConsumer("nsq", "consumer", config)
    if err != nil {
        fmt.Println("Failed to init consumer: ", err.Error())
        return
    }
    c.AddHandler(nsq.HandlerFunc(func(m *nsq.Message) error {
        fmt.Println("received message: ", string(m.Body))
        m.Finish()
        return nil
    }))
    err = c.ConnectToNSQD("127.0.0.1:4150")
    if err != nil {
        fmt.Println("Failed to connect to nsqd: ", err.Error())
        return
    }
    <-c.StopChan
}

During ConnectToNSQD, there are two interactions with the server side. The first step:

resp, err := conn.Connect()

Inside Connect:

    conn, err := dialer.Dial("tcp", c.addr)
    if err != nil {
        return nil, err
    }

    c.conn = conn.(*net.TCPConn)
    c.r = conn
    c.w = conn

    _, err = c.Write(MagicV2)

Once the TCP connection is established, the client sends the protocol version to the server. This matches what we saw in the TCP handler: the server reads a 4-byte protocol magic at the start of every connection.
The second interaction:

    cmd := Subscribe(r.topic, r.channel)
    err = conn.WriteCommand(cmd)

The client sends a `SUB topic channel` command to the server.
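At the wire level, these two steps amount to writing the 4-byte magic and then a single text line. Here is a rough, illustrative sketch using a raw socket (the address, topic, and channel are the same placeholders as in the consumer example above; a real client should use go-nsq instead):

package main

import (
    "fmt"
    "net"
)

func main() {
    conn, err := net.Dial("tcp", "127.0.0.1:4150")
    if err != nil {
        fmt.Println("dial failed:", err)
        return
    }
    defer conn.Close()

    // protocol magic: two spaces followed by "V2"
    if _, err := conn.Write([]byte("  V2")); err != nil {
        fmt.Println("write magic failed:", err)
        return
    }

    // the SUB command is a single text line: "SUB <topic> <channel>\n"
    if _, err := conn.Write([]byte("SUB nsq consumer\n")); err != nil {
        fmt.Println("write SUB failed:", err)
        return
    }

    // the server answers with a framed response (size + frame type + data)
    buf := make([]byte, 256)
    n, err := conn.Read(buf)
    if err != nil {
        fmt.Println("read failed:", err)
        return
    }
    fmt.Printf("raw response: % x\n", buf[:n])
}

A successful SUB comes back as a framed "OK" response in the format described earlier.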
Let's see how the server side handles this command.
In the method func (p *protocolV2) SUB(client *clientV2, params [][]byte) ([]byte, error), skipping the various checks:

    topic := p.ctx.nsqd.GetTopic(topicName)
    channel := topic.GetChannel(channelName)
    channel.AddClient(client.ID, client)

    atomic.StoreInt32(&client.State, stateSubscribed)
    client.Channel = channel
    // update message pump
    client.SubEventChan <- channel

Besides adding the client to the channel and assigning the channel to the client, the message pump is also updated via SubEventChan, so it is time to see what this message pump does.

Back in IOLoop (started from the TCP handler), there is this code:

    messagePumpStartedChan := make(chan bool)
    go p.messagePump(client, messagePumpStartedChan)
    <-messagePumpStartedChan

Now enter protocolV2's messagePump:

func (p *protocolV2) messagePump(client *clientV2, startedChan chan bool) {
    var err error
    var buf bytes.Buffer
    var memoryMsgChan chan *Message
    var backendMsgChan chan []byte
    var subChannel *Channel
    // NOTE: `flusherChan` is used to bound message latency for
    // the pathological case of a channel on a low volume topic
    // with >1 clients having >1 RDY counts
    var flusherChan <-chan time.Time
    var sampleRate int32

    subEventChan := client.SubEventChan
    identifyEventChan := client.IdentifyEventChan
    outputBufferTicker := time.NewTicker(client.OutputBufferTimeout)
    heartbeatTicker := time.NewTicker(client.HeartbeatInterval)
    heartbeatChan := heartbeatTicker.C
    msgTimeout := client.MsgTimeout

    // v2 opportunistically buffers data to clients to reduce write system calls
    // we force flush in two cases:
    //    1. when the client is not ready to receive messages
    //    2. we're buffered and the channel has nothing left to send us
    //       (ie. we would block in this loop anyway)
    //
    flushed := true

    // signal to the goroutine that started the messagePump
    // that we've started up
    close(startedChan)

    for {
        if subChannel == nil || !client.IsReadyForMessages() {
            // the client is not ready to receive messages...
            memoryMsgChan = nil
            backendMsgChan = nil
            flusherChan = nil
            // force flush
            client.writeLock.Lock()
            err = client.Flush()
            client.writeLock.Unlock()
            if err != nil {
                goto exit
            }
            flushed = true
            // ztd: once subChannel is no longer nil, memoryMsgChan and
            // backendMsgChan get assigned below
        } else if flushed {
            // last iteration we flushed...
            // do not select on the flusher ticker channel
            memoryMsgChan = subChannel.memoryMsgChan
            backendMsgChan = subChannel.backend.ReadChan()
            flusherChan = nil
        } else {
            // we're buffered (if there isn't any more data we should flush)...
            // select on the flusher ticker channel, too
            memoryMsgChan = subChannel.memoryMsgChan
            backendMsgChan = subChannel.backend.ReadChan()
            flusherChan = outputBufferTicker.C
        }

        select {
        case <-flusherChan:
            // if this case wins, we're either starved
            // or we won the race between other channels...
            // in either case, force flush
            client.writeLock.Lock()
            err = client.Flush()
            client.writeLock.Unlock()
            if err != nil {
                goto exit
            }
            flushed = true
        case <-client.ReadyStateChan:
        // ztd: the `client.SubEventChan <- channel` in the SUB handler is
        // what gives subChannel its value
        case subChannel = <-subEventChan:
            // you can't SUB anymore
            subEventChan = nil
        case identifyData := <-identifyEventChan:
            // you can't IDENTIFY anymore
            identifyEventChan = nil

            outputBufferTicker.Stop()
            if identifyData.OutputBufferTimeout > 0 {
                outputBufferTicker = time.NewTicker(identifyData.OutputBufferTimeout)
            }

            heartbeatTicker.Stop()
            heartbeatChan = nil
            if identifyData.HeartbeatInterval > 0 {
                heartbeatTicker = time.NewTicker(identifyData.HeartbeatInterval)
                heartbeatChan = heartbeatTicker.C
            }

            if identifyData.SampleRate > 0 {
                sampleRate = identifyData.SampleRate
            }

            msgTimeout = identifyData.MsgTimeout
        case <-heartbeatChan:
            err = p.Send(client, frameTypeResponse, heartbeatBytes)
            if err != nil {
                goto exit
            }
        case b := <-backendMsgChan:
            if sampleRate > 0 && rand.Int31n(100) > sampleRate {
                continue
            }

            msg, err := decodeMessage(b)
            if err != nil {
                p.ctx.nsqd.logf("ERROR: failed to decode message - %s", err)
                continue
            }
            msg.Attempts++

            subChannel.StartInFlightTimeout(msg, client.ID, msgTimeout)
            client.SendingMessage()
            err = p.SendMessage(client, msg, &buf)
            if err != nil {
                goto exit
            }
            flushed = false
        case msg := <-memoryMsgChan:
            if sampleRate > 0 && rand.Int31n(100) > sampleRate {
                continue
            }
            msg.Attempts++

            subChannel.StartInFlightTimeout(msg, client.ID, msgTimeout)
            client.SendingMessage()
            err = p.SendMessage(client, msg, &buf)
            if err != nil {
                goto exit
            }
            flushed = false
        case <-client.ExitChan:
            goto exit
        }
    }

exit:
    p.ctx.nsqd.logf("PROTOCOL(V2): [%s] exiting messagePump", client)
    heartbeatTicker.Stop()
    outputBufferTicker.Stop()
    if err != nil {
        p.ctx.nsqd.logf("PROTOCOL(V2): [%s] messagePump error - %s", client, err)
    }
}

In this code, once a client subscribes to a channel, messagePump listens on that channel's memoryMsgChan, waiting for a producer to send messages over. Let's take a look at the PUB process, starting from the producer side (sketch below).
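Here is a minimal producer that would exercise this path, mirroring the consumer above (a sketch using go-nsq; the address, topic name, and message body are placeholders):

package main

import (
    "fmt"

    nsq "github.com/nsqio/go-nsq"
)

func main() {
    config := nsq.NewConfig()
    p, err := nsq.NewProducer("127.0.0.1:4150", config)
    if err != nil {
        fmt.Println("Failed to init producer: ", err.Error())
        return
    }
    defer p.Stop()

    // PUB a single message to the topic the consumer above subscribed to
    if err := p.Publish("nsq", []byte("hello nsq")); err != nil {
        fmt.Println("Failed to publish: ", err.Error())
        return
    }
}

On the server side, the PUB handler boils down to this: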
topic := p.ctx.nsqd.GetTopic(topicName)

msg := NewMessage(topic.GenerateID(), messageBody)
err = topic.PutMessage(msg)
if err != nil {
    return nil, protocol.NewFatalClientErr(err, "E_PUB_FAILED", "PUB failed "+err.Error())
}
After a series of checks, a message is put into the topic. Every time a new topic is created, a messagePump is started for that topic:
// Topic constructor
func NewTopic(topicName string, ctx *context, deleteCallback func(*Topic)) *Topic {
    t := &Topic{
        name:              topicName,
        channelMap:        make(map[string]*Channel),
        memoryMsgChan:     make(chan *Message, ctx.nsqd.getOpts().MemQueueSize),
        exitChan:          make(chan int),
        channelUpdateChan: make(chan int),
        ctx:               ctx,
        pauseChan:         make(chan bool),
        deleteCallback:    deleteCallback,
        idFactory:         NewGUIDFactory(ctx.nsqd.getOpts().ID),
    }

    if strings.HasSuffix(topicName, "#ephemeral") {
        t.ephemeral = true
        t.backend = newDummyBackendQueue()
    } else {
        t.backend = diskqueue.New(topicName,
            ctx.nsqd.getOpts().DataPath,
            ctx.nsqd.getOpts().MaxBytesPerFile,
            int32(minValidMsgLength),
            int32(ctx.nsqd.getOpts().MaxMsgSize)+minValidMsgLength,
            ctx.nsqd.getOpts().SyncEvery,
            ctx.nsqd.getOpts().SyncTimeout,
            ctx.nsqd.getOpts().Logger)
    }

    t.waitGroup.Wrap(func() { t.messagePump() })

    t.ctx.nsqd.Notify(t)

    return t
}
In the topic's messagePump, the topic's memoryMsgChan is listened on:
for {
    select {
    case msg = <-memoryMsgChan:
Each time a message is received, it is broadcast to every channel under the topic:
    for i, channel := range chans {
        chanMsg := msg
        // copy the message because each channel
        // needs a unique instance but...
        // fastpath to avoid copy if its the first channel
        // (the topic already created the first copy)
        if i > 0 {
            chanMsg = NewMessage(msg.ID, msg.Body)
            chanMsg.Timestamp = msg.Timestamp
            chanMsg.deferred = msg.deferred
        }
        if chanMsg.deferred != 0 {
            channel.PutMessageDeferred(chanMsg, chanMsg.deferred)
            continue
        }
        err := channel.PutMessage(chanMsg)
        if err != nil {
            t.ctx.nsqd.logf(
                "TOPIC(%s) ERROR: failed to put msg(%s) to channel(%s) - %s",
                t.name, msg.ID, channel.name, err)
        }
    }
If you are familiar with NSQ, you know that each topic broadcasts a msg to all of its channels; this is where that logic is implemented. The channel's PutMessage (which calls put):
func (c *Channel) put(m *Message) error {
    select {
    case c.memoryMsgChan <- m:
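The excerpt stops at the memoryMsgChan case; the elided default branch (reproduced roughly from nsqd of this vintage, so treat the helper names as approximate) spills the message to the disk-backed queue when the in-memory channel is full, which is why messagePump also selects on backendMsgChan:

    // ...continuation of put (approximate):
    default:
        // memoryMsgChan is full: write the message to the backend (disk) queue
        b := bufferPoolGet()
        err := writeMessageToBackend(b, m, c.backend)
        bufferPoolPut(b)
        c.ctx.nsqd.SetHealth(err)
        if err != nil {
            c.ctx.nsqd.logf("CHANNEL(%s) ERROR: failed to write message to backend - %s",
                c.name, err)
            return err
        }
    }
    return nil
}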
The message is pushed into the channel's memoryMsgChan. At this point, the code comes back to protocolV2's messagePump:
    case msg := <-memoryMsgChan:
        if sampleRate > 0 && rand.Int31n(100) > sampleRate {
            continue
        }
        msg.Attempts++

        // ztd: for reliability, sending a message to a subscriber does not
        // actually delete it; instead it is kept in flight with an expiry timeout
        subChannel.StartInFlightTimeout(msg, client.ID, msgTimeout)
        client.SendingMessage()
        // ztd: send the message to the client
        err = p.SendMessage(client, msg, &buf)
        if err != nil {
            goto exit
        }
        flushed = false
In the client code, note the line `m.Finish()`. It tells the server that this message has been consumed and can be discarded, by sending a `FIN` command to the server. On the server side:
// FinishMessage successfully discards an in-flight message
func (c *Channel) FinishMessage(clientID int64, id MessageID) error {
    msg, err := c.popInFlightMessage(clientID, id)
    if err != nil {
        return err
    }
    c.removeFromInFlightPQ(msg)
    if c.e2eProcessingLatencyStream != nil {
        c.e2eProcessingLatencyStream.Insert(msg.Timestamp)
    }
    return nil
}
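FinishMessage has two structures to clean up because StartInFlightTimeout, called from messagePump when the message was dispatched, put the message into both. Roughly (from nsqd of this era; field and helper names are approximate):

func (c *Channel) StartInFlightTimeout(msg *Message, clientID int64, timeout time.Duration) error {
    now := time.Now()
    msg.clientID = clientID
    msg.deliveryTS = now
    // pri is the expiry deadline in nanoseconds; the in-flight priority
    // queue is ordered by this value
    msg.pri = now.Add(timeout).UnixNano()
    err := c.pushInFlightMessage(msg) // the map keyed by message ID
    if err != nil {
        return err
    }
    c.addToInFlightPQ(msg) // the min-heap ordered by expiry
    return nil
}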
So the msg lives in two structures: the in-flight message store (whose underlying data structure is actually a map) and the inFlightPQ; the role of the inFlightPQ is discussed below. But what happens to a message that is never finished in time? How is a timeout handled? In NSQD's Main function, a queueScanLoop is started:
n.waitGroup.Wrap(func() { n.queueScanLoop() })
In this loop, tickers are set up; at regular intervals resizePool is re-run:
func (n *NSQD) queueScanLoop() {
    workCh := make(chan *Channel, n.getOpts().QueueScanSelectionCount)
    responseCh := make(chan bool, n.getOpts().QueueScanSelectionCount)
    closeCh := make(chan int)

    workTicker := time.NewTicker(n.getOpts().QueueScanInterval)
    refreshTicker := time.NewTicker(n.getOpts().QueueScanRefreshInterval)

    channels := n.channels()
    n.resizePool(len(channels), workCh, responseCh, closeCh)

    for {
        select {
        case <-workTicker.C:
            if len(channels) == 0 {
                continue
            }
        case <-refreshTicker.C:
            channels = n.channels()
            n.resizePool(len(channels), workCh, responseCh, closeCh)
            continue
        case <-n.exitChan:
            goto exit
        }
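resizePool maintains a pool of queueScanWorker goroutines sized relative to the number of channels; each worker takes a channel off workCh, scans it, and reports on responseCh whether it was "dirty". A rough sketch of such a worker (approximating nsqd's queueScanWorker of this era):

func (n *NSQD) queueScanWorker(workCh chan *Channel, responseCh chan bool, closeCh chan int) {
    for {
        select {
        case c := <-workCh:
            now := time.Now().UnixNano()
            dirty := false
            // a channel is "dirty" if it had expired in-flight or deferred messages
            if c.processInFlightQueue(now) {
                dirty = true
            }
            if c.processDeferredQueue(now) {
                dirty = true
            }
            responseCh <- dirty
        case <-closeCh:
            return
        }
    }
}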
Via the pool of workers that resizePool maintains, the following function is executed for each channel:
func (c *Channel) processInFlightQueue(t int64) bool {
    c.exitMutex.RLock()
    defer c.exitMutex.RUnlock()

    if c.Exiting() {
        return false
    }

    dirty := false
    for {
        c.inFlightMutex.Lock()
        // ztd: peek at the soonest-expiring message and pop it if it has expired
        msg, _ := c.inFlightPQ.PeekAndShift(t)
        c.inFlightMutex.Unlock()

        if msg == nil {
            goto exit
        }
        dirty = true

        _, err := c.popInFlightMessage(msg.clientID, msg.ID)
        if err != nil {
            goto exit
        }
        atomic.AddUint64(&c.timeoutCount, 1)
        c.RLock()
        client, ok := c.clients[msg.clientID]
        c.RUnlock()
        if ok {
            client.TimedOutMessage()
        }
        c.put(msg)
    }

exit:
    return dirty
}

This function keeps taking expired messages out of the inFlightPqueue and removing them from the in-flight message map. PeekAndShift:
func (pq *inFlightPqueue) PeekAndShift(max int64) (*Message, int64) {
    if len(*pq) == 0 {
        return nil, 0
    }

    x := (*pq)[0]
    if x.pri > max {
        return nil, x.pri - max
    }
    pq.Pop()

    return x, 0
}

The inFlightPqueue is a min-heap; every time a new message is pushed, it sifts up:
func (pq *inFlightPqueue) up(j int) {
    for {
        i := (j - 1) / 2 // parent
        if i == j || (*pq)[j].pri >= (*pq)[i].pri {
            break
        }
        pq.Swap(i, j)
        j = i
    }
}
So the message at the top of the heap is the one that will expire soonest. If even that message has not yet expired, there are no expired messages at all; if it has, it is popped. In this way the for loop keeps popping expired messages until none are left.
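As a self-contained illustration of the same idea (standard-library code, not nsqd's), a min-heap keyed on expiry deadlines always exposes the soonest deadline at index 0, so draining expired entries is exactly the peek-and-pop loop above:

package main

import (
    "container/heap"
    "fmt"
)

// item mimics the relevant part of an in-flight message: a priority that
// is its expiry deadline in nanoseconds.
type item struct {
    id  string
    pri int64
}

type minHeap []*item

func (h minHeap) Len() int            { return len(h) }
func (h minHeap) Less(i, j int) bool  { return h[i].pri < h[j].pri }
func (h minHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x interface{}) { *h = append(*h, x.(*item)) }
func (h *minHeap) Pop() interface{} {
    old := *h
    n := len(old)
    x := old[n-1]
    *h = old[:n-1]
    return x
}

func main() {
    h := &minHeap{}
    heap.Init(h)
    heap.Push(h, &item{id: "a", pri: 300})
    heap.Push(h, &item{id: "b", pri: 100})
    heap.Push(h, &item{id: "c", pri: 200})

    now := int64(150)
    // pop everything whose deadline has passed, like processInFlightQueue does
    for h.Len() > 0 && (*h)[0].pri <= now {
        expired := heap.Pop(h).(*item)
        fmt.Println("expired:", expired.id)
    }
    // prints: expired: b
}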