NSQ Source Learning


Brief introduction

NSQ is a distributed message queue implemented in Go. Reading its source is a good way to deepen one's understanding of Go channels and of distributed systems.

Code structure

The core code is divided into 3 parts:

    • nsqd: queue data storage and message delivery
    • nsqlookupd: manages nsqd nodes and provides service discovery
    • nsqadmin: web UI for visualizing the NSQ cluster

Nsqd

The official introduction is

nsqd is the daemon that receives, queues, and delivers messages to clients.

It can be run standalone but is normally configured in a cluster with nsqlookupd instance(s) (in which case it will announce topics and channels for discovery).

It listens on two TCP ports, one for clients and another for the HTTP API. It can optionally listen on a third port for HTTPS.

In short: nsqd is the daemon that receives, queues, and delivers messages. It is usually deployed as part of a cluster, but it can also run standalone.

This article looks at two pieces of nsqd logic:

    1. Start logic
    2. Data storage

Start logic

In the Makefile, the nsqd target is defined as:

$(BLDDIR)/nsqd:        $(wildcard apps/nsqd/*.go       nsqd/*.go       nsq/*.go internal/*/*.go)

From this you can find the entry point for nsqd in apps/nsqd/nsqd.go.

apps/nsqd/nsqd.go

This file is the program entry point and mainly does a few things:

    • Receive command line arguments
    • Create a new NSQD structure based on command line arguments
    • Start NSQD

First, the author uses the svc package to control program startup and shutdown:

type program struct {
    nsqd *nsqd.NSQD
}

func main() {
    prg := &program{}
    if err := svc.Run(prg, syscall.SIGINT, syscall.SIGTERM); err != nil {
        log.Fatal(err)
    }
}

func (p *program) Init(env svc.Environment) error {...}
func (p *program) Start() error {...}
func (p *program) Stop() error {...}

Using svc is a concise way to make sure the program exits cleanly. nsqd handles two exit signals: SIGINT (Ctrl+C) and SIGTERM (kill).

The Start() function is the main logical entry point. It calls nsqd.NewOptions(), which creates a default Options struct; these options later serve as the parameter source for starting nsqd.

opts := nsqd.NewOptions()

The author receives command-line arguments through the flag package, and if a configuration file is passed on the command line it is read as well. An NSQD struct is then created from the configuration file and command-line arguments:

options.Resolve(opts, flagSet, cfg)
nsqd := nsqd.New(opts)
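
As an aside, here is a minimal sketch of command-line parsing with the standard flag package; the flag names below are illustrative only, not nsqd's full flag set.

package main

import (
    "flag"
    "fmt"
    "os"
)

func main() {
    // Hypothetical flag set mirroring the pattern used in apps/nsqd:
    // flags are parsed first, then merged with config-file values.
    flagSet := flag.NewFlagSet("nsqd", flag.ExitOnError)
    tcpAddress := flagSet.String("tcp-address", "0.0.0.0:4150", "<addr>:<port> to listen on for TCP clients")
    config := flagSet.String("config", "", "path to config file")

    flagSet.Parse(os.Args[1:])

    fmt.Println("tcp-address:", *tcpAddress)
    fmt.Println("config:", *config)
}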

The persisted metadata is then loaded and re-persisted:

err := nsqd.LoadMetadata()
err = nsqd.PersistMetadata()

The LoadMetadata() process is:

    1. First use the atomic package and a lock for concurrency control
    2. Read the metadata file named after the node ID as well as the default metadata file, compare the two, and take the data from the appropriate file
    3. Parse the JSON data into a meta structure
    4. Traverse the meta structure to get the topic and channel names, and pause any topic/channel that needs to be paused

The PersistMetadata() process is:

    1. Collect the current topics and channels from the NSQD struct
    2. Persist the topics and channels to a file (a minimal sketch of this pattern follows the list)
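
As an illustration only (the field names and file layout here are assumptions, not nsqd's actual metadata format), persisting metadata as JSON can be done by writing to a temporary file first and then renaming it into place, so readers never see a half-written file.

package main

import (
    "encoding/json"
    "os"
)

// Simplified, hypothetical metadata shape -- not nsqd's real structures.
type channelMeta struct {
    Name   string `json:"name"`
    Paused bool   `json:"paused"`
}

type topicMeta struct {
    Name     string        `json:"name"`
    Paused   bool          `json:"paused"`
    Channels []channelMeta `json:"channels"`
}

type metadata struct {
    Topics []topicMeta `json:"topics"`
}

// persistMetadata writes the metadata JSON to a temp file and renames it.
func persistMetadata(fileName string, meta metadata) error {
    data, err := json.Marshal(&meta)
    if err != nil {
        return err
    }
    tmpFileName := fileName + ".tmp"
    if err := os.WriteFile(tmpFileName, data, 0644); err != nil {
        return err
    }
    return os.Rename(tmpFileName, fileName)
}

func main() {
    meta := metadata{Topics: []topicMeta{{Name: "demo", Channels: []channelMeta{{Name: "tail"}}}}}
    if err := persistMetadata("nsqd.dat", meta); err != nil {
        panic(err)
    }
}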

Next, nsqd.Main() is called to start the main logic of nsqd, which does the following:

    1. Listen on the TCP port, HTTP port, and HTTPS port according to the options
    2. Launch four goroutines to start the HTTP API, queueScanLoop, lookupLoop, and statsdLoop:

n.waitGroup.Wrap(func() {
    http_api.Serve(n.httpListener, httpServer, "HTTP", n.logf)
})
n.waitGroup.Wrap(func() { n.queueScanLoop() })
n.waitGroup.Wrap(func() { n.lookupLoop() })
if n.getOpts().StatsdAddress != "" {
    n.waitGroup.Wrap(func() { n.statsdLoop() })
}

A WaitGroup is used here. It is a goroutine-coordination primitive, similar in spirit to join() in Python threading: the program waits for all goroutines to finish before exiting.

The author wraps sync.WaitGroup in a small helper:

type WaitGroupWrapper struct {
    sync.WaitGroup
}

func (w *WaitGroupWrapper) Wrap(cb func()) {
    w.Add(1)
    go func() {
        cb()
        w.Done()
    }()
}

Add() increments the counter by one and Done() decrements it by one. WaitGroup also provides Wait(): when the counter reaches zero it returns, otherwise it blocks. This is how the program waits for all goroutines to finish before exiting.

Also note that a function is passed in as an argument and then executed inside a goroutine, somewhat like a Python decorator.
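
A minimal usage sketch of this wrapper (the task bodies below are made up for illustration):

package main

import (
    "fmt"
    "sync"
)

type WaitGroupWrapper struct {
    sync.WaitGroup
}

// Wrap runs cb in its own goroutine and tracks it in the WaitGroup.
func (w *WaitGroupWrapper) Wrap(cb func()) {
    w.Add(1)
    go func() {
        cb()
        w.Done()
    }()
}

func main() {
    var wg WaitGroupWrapper
    wg.Wrap(func() { fmt.Println("serving HTTP (pretend)") })
    wg.Wrap(func() { fmt.Println("scanning queues (pretend)") })
    wg.Wait() // blocks until both wrapped goroutines call Done()
}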

Back in Main(), the HTTP API is started using the github.com/nsqio/nsq/internal/http_api package, which sets up the router and so on.

queueScanLoop() is a scanning loop: it feeds the topic/channel objects into a worker channel, and at fixed intervals it refreshes the channel list and adjusts the number of workers that scan the channels' data.

select {
case <-workTicker.C:
    if len(channels) == 0 {
        continue
    }
case <-refreshTicker.C:
    channels = n.channels()
    n.resizePool(len(channels), workCh, responseCh, closeCh)
    continue
case <-n.exitChan:
    goto exit
}

Here select is used to listen on several channels: on every scan interval the loop checks whether there are channels to process, and skips the scan if there are none.

On every refresh interval it checks whether the number of workers needs to change.

loop:
    numDirty := 0
    for i := 0; i < num; i++ {
        if <-responseCh {
            numDirty++
        }
    }
    if float64(numDirty)/float64(num) > n.getOpts().QueueScanDirtyPercent {
        goto loop
    }

There is also the notion of a dirty ratio: a channel that had messages to process is considered dirty, and when the ratio of dirty channels exceeds the configured value, the workers are called again immediately instead of waiting for the next fixed scan interval.
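
As a rough sketch of the resizable worker-pool idea (simplified; nsqd's real resizePool differs in details such as how the target pool size is computed), workers can be added by launching goroutines and removed by signalling a close channel:

package main

import "fmt"

// resizePool grows or shrinks a pool of worker goroutines to the target size.
// workCh carries items to scan, closeCh tells one worker at a time to exit.
func resizePool(target int, poolSize *int, workCh chan string, closeCh chan int) {
    for {
        if *poolSize == target {
            break
        }
        if *poolSize < target {
            // grow: start another worker
            go worker(workCh, closeCh)
            *poolSize++
        } else {
            // shrink: ask one worker to exit
            closeCh <- 1
            *poolSize--
        }
    }
}

func worker(workCh chan string, closeCh chan int) {
    for {
        select {
        case item := <-workCh:
            fmt.Println("scanning", item)
        case <-closeCh:
            return
        }
    }
}

func main() {
    workCh := make(chan string)
    closeCh := make(chan int)
    poolSize := 0
    resizePool(4, &poolSize, workCh, closeCh)
    fmt.Println("pool size:", poolSize)
    resizePool(2, &poolSize, workCh, closeCh)
    fmt.Println("pool size:", poolSize)
}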

lookupLoop() and statsdLoop() are also started; at first glance they handle communication with nsqlookupd and statsd, and their details are left for later.

That covers the startup logic of nsqd. nsqd interacts with users through its HTTP API, which leads into the next topic.

Data storage

The API documentation describes the /pub interface for publishing messages:

Example usage:
curl -d "<message>" http://127.0.0.1:4151/pub?topic=name
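For comparison, publishing from Go can be done with the official go-nsq client. A minimal sketch (the address and topic name are just examples):

package main

import (
    "log"

    "github.com/nsqio/go-nsq"
)

func main() {
    cfg := nsq.NewConfig()
    // 4150 is nsqd's default TCP port; adjust for your deployment.
    producer, err := nsq.NewProducer("127.0.0.1:4150", cfg)
    if err != nil {
        log.Fatal(err)
    }
    defer producer.Stop()

    // Equivalent to: curl -d "hello" http://127.0.0.1:4151/pub?topic=name
    if err := producer.Publish("name", []byte("hello")); err != nil {
        log.Fatal(err)
    }
}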

The routing rules are defined in nsqd/http.go:

func newHTTPServer(ctx *context, tlsEnabled bool, tlsRequired bool) *httpServer {
    ...
    s := &httpServer{
        ctx:         ctx,
        tlsEnabled:  tlsEnabled,
        tlsRequired: tlsRequired,
        router:      router,
    }
    router.Handle("POST", "/pub", http_api.Decorate(s.doPUB, http_api.V1))
    ...
}

Following the data-storage path inside doPUB(), the call eventually reaches topic.PutMessage(msg):

err = topic.PutMessage(msg)

func (t *Topic) PutMessage(m *Message) error {
    t.RLock()
    defer t.RUnlock()
    if atomic.LoadInt32(&t.exitFlag) == 1 {
        return errors.New("exiting")
    }
    err := t.put(m)
    if err != nil {
        return err
    }
    atomic.AddUint64(&t.messageCount, 1)
    return nil
}

The logic of PutMessage() is to do concurrency control (locking) and then call Topic.put(*Message) to write the message.

Here are two lock control mechanisms:

    1. RLock
    2. Atomic

RLock

In Go, the sync package provides two kinds of locks: the mutex sync.Mutex and the read-write lock sync.RWMutex.

type Mutex
    func (m *Mutex) Lock()
    func (m *Mutex) Unlock()

type RWMutex
    func (rw *RWMutex) Lock()
    func (rw *RWMutex) RLock()
    func (rw *RWMutex) RLocker() Locker
    func (rw *RWMutex) RUnlock()
    func (rw *RWMutex) Unlock()

A mutex tends to be used for global exclusion: once locked, it must be unlocked before anyone else can acquire it. Locking it twice from the same goroutine deadlocks, and unlocking it twice causes a runtime error.

A read-write lock suits workloads where reads far outnumber writes.

Lock() takes the write lock. If a write lock or any read locks are already held, it blocks until the lock becomes available. A blocked Lock() call also excludes new readers from acquiring the read lock; in other words, the write lock has priority over read locks, and a pending writer is served first.

RLock() takes a read lock. It cannot be acquired while a write lock is held, but it can be acquired when only read locks (or no locks) are held. Many read locks can be held at once, which is why it suits the "read much, write little" scenario.
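
A small illustration (a made-up counter, not NSQ code) of how multiple readers can share the read lock while a writer takes the exclusive lock:

package main

import (
    "fmt"
    "sync"
)

type counter struct {
    mu sync.RWMutex
    n  int
}

// Get takes the read lock; many readers may hold it at the same time.
func (c *counter) Get() int {
    c.mu.RLock()
    defer c.mu.RUnlock()
    return c.n
}

// Inc takes the write lock; it excludes both readers and other writers.
func (c *counter) Inc() {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.n++
}

func main() {
    var c counter
    var wg sync.WaitGroup
    for i := 0; i < 4; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            c.Inc()
            fmt.Println("read:", c.Get())
        }()
    }
    wg.Wait()
    fmt.Println("final:", c.Get()) // always 4
}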

For more examples of read-write locks, see articles comparing sync.RWMutex and sync.Mutex in Go.

Atomic

sync/atomic is another synchronization mechanism in the sync family. It sits at a lower level than a mutex: a mutex is a Go library construct, while atomic operations map to low-level atomic instructions. Atomic operations are therefore cheaper than a mutex, but they come with restrictions; for example, with the Value storage interface, storing nil or storing a value of a different type than before causes a panic.

In addition, several articles and Stack Overflow answers recommend minimizing direct use of atomic; the exact reasoning is not explored here.

There are several common functions of atomic:

    1. CAS: compare-and-swap; if the current value equals the old value, write the new value
    2. Add: increment or decrement
    3. Load/Store: read or write

See the atomic package documentation for details.
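
A short sketch of the three groups above (plain standard-library calls, unrelated to NSQ's code):

package main

import (
    "fmt"
    "sync/atomic"
)

func main() {
    var n int64

    // Add: increment or decrement.
    atomic.AddInt64(&n, 1)
    atomic.AddInt64(&n, -1)

    // CAS: only write 10 if the current value is still 0.
    swapped := atomic.CompareAndSwapInt64(&n, 0, 10)
    fmt.Println("swapped:", swapped) // true

    // Load/Store: read and write without tearing.
    atomic.StoreInt64(&n, 42)
    fmt.Println("loaded:", atomic.LoadInt64(&n)) // 42
}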

In the PutMessage() logic above, a read lock is taken on the topic and some of its fields are accessed with atomic operations; put() is then called to do the actual write:

func (t *Topic) put(m *Message) error {
    select {
    case t.memoryMsgChan <- m:
    default:
        b := bufferPoolGet()
        err := writeMessageToBackend(b, m, t.backend)
        bufferPoolPut(b)
        t.ctx.nsqd.SetHealth(err)
        if err != nil {
            t.ctx.nsqd.logf(LOG_ERROR,
                "TOPIC(%s) ERROR: failed to write message to backend - %s",
                t.name, err)
            return err
        }
    }
    return nil
}

put() tries to write the message into the topic's in-memory channel; if memoryMsgChan is full, the default branch of the select kicks in and the message is written to the backend through a buffer.

The buffer is implemented with sync.Pool, which acts as a cache of reusable objects: pooled objects may be released at GC time, and the amount that can be stored is bounded only by available memory.
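
A minimal sketch of the sync.Pool pattern used for the buffers (a generic bytes.Buffer pool, not nsqd's exact bufferPoolGet/bufferPoolPut helpers):

package main

import (
    "bytes"
    "fmt"
    "sync"
)

// A pool of reusable byte buffers; New is called when the pool is empty.
var bufferPool = sync.Pool{
    New: func() interface{} { return new(bytes.Buffer) },
}

func main() {
    // Get a buffer, use it, then reset and return it for reuse.
    b := bufferPool.Get().(*bytes.Buffer)
    b.WriteString("serialized message bytes")
    fmt.Println(b.Len())

    b.Reset()
    bufferPool.Put(b)
}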

Here are two questions:

    1. Is a topic write considered complete once the message is put into memoryMsgChan?
    2. What happens to the data written into the buffer/backend?

Digging further, the function that drains both channels is messagePump(), which is started in the background when a new topic is created:

func NewTopic(topicName string, ctx *context, deleteCallback func(*Topic)) *Topic {
    ...
    t.waitGroup.Wrap(func() { t.messagePump() })
    ...
}

func (t *Topic) messagePump() {
    ...
    if len(chans) > 0 {
        memoryMsgChan = t.memoryMsgChan
        backendChan = t.backend.ReadChan()
    }
    select {
    case msg = <-memoryMsgChan:
    case buf = <-backendChan:
        msg, err = decodeMessage(buf)
        if err != nil {
            t.ctx.nsqd.logf(LOG_ERROR, "failed to decode message - %s", err)
            continue
        }
        ...
    }
    ...
    for i, channel := range chans {
        chanMsg := msg
        if i > 0 {
            chanMsg = NewMessage(msg.ID, msg.Body)
            chanMsg.Timestamp = msg.Timestamp
            chanMsg.deferred = msg.deferred
        }
        ...
        err := channel.PutMessage(chanMsg)
        ...
    }
    ...
}

The call to channel.PutMessage() above writes the message into each channel's memoryMsgChan, with write logic similar to the topic's. That completes the walk-through of the data-write path.

Nsqlookupd

The official introduction is as follows

nsqlookupd is the daemon that manages topology information. Clients query nsqlookupd to discover nsqd producers for a specific topic, and nsqd nodes broadcast topic and channel information.

There are two interfaces: a TCP interface which is used by nsqd for broadcasts, and an HTTP interface for clients to perform discovery and administrative actions.

In short: nsqlookupd is the daemon that manages the topology information of the cluster. It is used for:

    1. Serving client queries for the nsqd nodes that carry a specific topic and channel
    2. Receiving broadcasts from nsqd nodes announcing their own information.

Here is a look at two pieces of nsqlookupd logic:

    1. Serving client queries for specific topic data
    2. Receiving nsqd broadcasts.

Query topic and Channel

For querying data, NSQ ships several well-packaged consumer tools, such as nsq_tail and nsq_to_file. nsq_tail is used as the example here.

The main logic of nsq_tail is as follows:

consumers := []*nsq.Consumer{}
for i := 0; i < len(topics); i += 1 {
    fmt.Printf("Adding consumer for topic: %s\n", topics[i])
    consumer, err := nsq.NewConsumer(topics[i], *channel, cfg)
    if err != nil {
        log.Fatal(err)
    }
    consumer.AddHandler(&TailHandler{topicName: topics[i], totalMessages: *totalMessages})
    err = consumer.ConnectToNSQDs(nsqdTCPAddrs)
    if err != nil {
        log.Fatal(err)
    }
    err = consumer.ConnectToNSQLookupds(lookupdHTTPAddrs)
    if err != nil {
        log.Fatal(err)
    }
    consumers = append(consumers, consumer)
}

nsq_tail creates one Consumer per topic (the Consumer implementation comes from the go-nsq library) and registers a handler, TailHandler, that implements the nsq_tail behaviour.

Data is then fetched from nsqd and nsqlookupd, and the handler is invoked for each message (a minimal handler sketch follows).
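
A minimal sketch of such a handler using go-nsq (the topic and channel names are placeholders, and this is not nsq_tail's actual TailHandler):

package main

import (
    "log"

    "github.com/nsqio/go-nsq"
)

// printHandler implements the nsq.Handler interface.
type printHandler struct{}

// HandleMessage is called once per received message; returning an error
// tells go-nsq to requeue the message.
func (h *printHandler) HandleMessage(m *nsq.Message) error {
    log.Printf("received: %s", m.Body)
    return nil
}

func main() {
    cfg := nsq.NewConfig()
    consumer, err := nsq.NewConsumer("name", "tail_channel", cfg)
    if err != nil {
        log.Fatal(err)
    }
    consumer.AddHandler(&printHandler{})

    // 4161 is nsqlookupd's default HTTP port; adjust for your deployment.
    if err := consumer.ConnectToNSQLookupd("127.0.0.1:4161"); err != nil {
        log.Fatal(err)
    }
    <-consumer.StopChan // block until the consumer is stopped
}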

In go-nsq/consumer.go, ConnectToNSQLookupd() calls queryLookupd() and starts lookupdLoop(), and lookupdLoop() calls queryLookupd() periodically. The code is as follows:

func (r *Consumer) ConnectToNSQLookupd(addr string) error {
    ...
    if numLookupd == 1 {
        r.queryLookupd()
        r.wg.Add(1)
        go r.lookupdLoop()
    }
    ...
}

func (r *Consumer) lookupdLoop() {
    ...
    for {
        select {
        case <-ticker.C:
            r.queryLookupd()
        case <-r.lookupdRecheckChan:
            r.queryLookupd()
        case <-r.exitChan:
            goto exit
        }
    }
    ...
}

// make an HTTP req to one of the configured nsqlookupd instances to discover
// which nsqd's provide the topic we are consuming.
//
// initiate a connection to any new producers that are identified.
func (r *Consumer) queryLookupd() {
    ...
    var nsqdAddrs []string
    for _, producer := range data.Producers {
        broadcastAddress := producer.BroadcastAddress
        port := producer.TCPPort
        joined := net.JoinHostPort(broadcastAddress, strconv.Itoa(port))
        nsqdAddrs = append(nsqdAddrs, joined)
    }
    // apply filter
    if discoveryFilter, ok := r.behaviorDelegate.(DiscoveryFilter); ok {
        nsqdAddrs = discoveryFilter.Filter(nsqdAddrs)
    }
    for _, addr := range nsqdAddrs {
        err = r.ConnectToNSQD(addr)
        if err != nil && err != ErrAlreadyConnected {
            r.log(LogLevelError, "(%s) error connecting to nsqd - %s", addr, err)
            continue
        }
    }
}

In queryLookupd(), after the producer information is obtained, ConnectToNSQD() is called to connect to each nsqd server; reading messages is set up inside ConnectToNSQD().

ConnectToNSQD() starts readLoop() on the resulting connection:

func (c *Conn) readLoop() {
    for {
        ...
        frameType, data, err := ReadUnpackedResponse(c)
        ...
        switch frameType {
        case FrameTypeResponse:
            c.delegate.OnResponse(c, data)
        case FrameTypeMessage:
            msg, err := DecodeMessage(data)
            if err != nil {
                c.log(LogLevelError, "IO error - %s", err)
                c.delegate.OnIOError(c, err)
                goto exit
            }
            msg.Delegate = delegate
            msg.NSQDAddress = c.String()
            atomic.AddInt64(&c.rdyCount, -1)
            atomic.AddInt64(&c.messagesInFlight, 1)
            atomic.StoreInt64(&c.lastMsgTimestamp, time.Now().UnixNano())
            c.delegate.OnMessage(c, msg)
            ...
        }
    }
}

In c.delegate.OnMessage(c, msg), the message is written into Consumer.incomingMessages, completing the data-read path.
