K8s and monitoring: transforming Telegraf's buffer implementation

Tags: k8s, redis, cluster

Transforming Telegraf's buffer implementation

Objective

Recently we had a scenario using Telegraf in which data must not be lost when the program terminates unexpectedly. In the original Telegraf implementation, two buffers are maintained inside running_output: metrics and failMetrics. Both are built on Go channels. Because there is no persistence mechanism, there is a risk of data loss on an unexpected exit. This article therefore looks at how Telegraf originally ensures data safety, and at the optimizations we made to the code.

Telegraf's approach to data safety

The two buffers are defined in the RunningOutput struct in running_output.go:

// RunningOutput contains the output configuration
type RunningOutput struct {
    Name              string
    Output            telegraf.Output
    Config            *OutputConfig
    MetricBufferLimit int
    MetricBatchSize   int

    MetricsFiltered selfstat.Stat
    MetricsWritten  selfstat.Stat
    BufferSize      selfstat.Stat
    BufferLimit     selfstat.Stat
    WriteTime       selfstat.Stat

    metrics     *buffer.Buffer
    failMetrics *buffer.Buffer

    // Guards against concurrent calls to the Output as described in #3009
    sync.Mutex
}

The sizes of these two buffers can be set through configuration parameters:

metrics:     buffer.NewBuffer(batchSize),
failMetrics: buffer.NewBuffer(bufferLimit),

As the names imply, metrics stores the metrics waiting to be sent to the configured output, and failMetrics stores the metrics that failed to send. Failed metrics are of course retried by Telegraf's own mechanism.
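That retry path is not shown in the snippets here, so the following is a simplified sketch of the idea only (not the verbatim Telegraf source): on a later flush, previously failed batches are drained from failMetrics first and re-queued if they fail again.

// Simplified sketch of the retry idea, not the verbatim Telegraf source:
// on a flush, drain failMetrics first and re-queue a batch that fails again.
func (ro *RunningOutput) retryFailed() {
    for !ro.failMetrics.IsEmpty() {
        batch := ro.failMetrics.Batch(ro.MetricBatchSize)
        if err := ro.write(batch); err != nil {
            // Still failing: put the batch back and give up for this flush.
            ro.failMetrics.Add(batch...)
            return
        }
    }
}

The add path that routes a failed batch into failMetrics in the first place is the following: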

if ro.metrics.Len() == ro.MetricBatchSize {
    batch := ro.metrics.Batch(ro.MetricBatchSize)
    err := ro.write(batch)
    if err != nil {
        ro.failMetrics.Add(batch...)
    }
}

When a metric is added to metrics, running_output checks whether the batch size has been reached and, if so, calls the write method. There is also a timer-based path: if MetricBatchSize has not been reached within a certain interval, the buffered data is flushed anyway. The implementation is in agent.go:

ticker := time.NewTicker(a.Config.Agent.FlushInterval.Duration)
semaphore := make(chan struct{}, 1)
for {
    select {
    case <-shutdown:
        log.Println("I! Hang on, flushing any cached metrics before shutdown")
        // wait for outMetricC to get flushed before flushing outputs
        wg.Wait()
        a.flush()
        return nil
    case <-ticker.C:
        go func() {
            select {
            case semaphore <- struct{}{}:
                internal.RandomSleep(a.Config.Agent.FlushJitter.Duration, shutdown)
                a.flush()
                <-semaphore
            default:
                // skipping this flush because one is already happening
                log.Println("W! Skipping a scheduled flush because there is" +
                    " already a flush ongoing.")
            }
        }()
    }
}

After the program receives the stop signal, it first flushes the remaining data to the output and then exits the process. This guarantees a certain degree of data safety.

Redis-based buffer persistence

When selecting a persistence mechanism, Redis was preferred: it offers high performance and supports persistence to disk.
The implementation is structured as follows:

The functions of the original buffer are abstracted into an interface in buffer.go.
The code:

package buffer

import (
    "github.com/influxdata/telegraf"
    "github.com/influxdata/telegraf/internal/buffer/memory"
    "github.com/influxdata/telegraf/internal/buffer/redis"
)

const (
    BufferTypeForMemory = "memory"
    BufferTypeForRedis  = "redis"
)

type Buffer interface {
    IsEmpty() bool
    Len() int
    Add(metrics ...telegraf.Metric)
    Batch(batchSize int) []telegraf.Metric
}

func NewBuffer(mod string, size int, key, addr string) Buffer {
    switch mod {
    case BufferTypeForRedis:
        return redis.NewBuffer(size, key, addr)
    default:
        return memory.NewBuffer(size)
    }
}

The Buffer interface is then implemented separately for memory and for Redis, with NewBuffer acting as a factory method.
Later on, buffer implementations based on files or a database could be added to cover other scenarios and requirements.
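As a brief usage sketch (the helper function, the key suffixes, and the wiring below are our own illustration, not code from the fork), the output side only needs to pass the chosen mode, the Redis key, and the address through to the factory:

package output

import "github.com/influxdata/telegraf/internal/buffer"

// buildBuffers shows how the two buffers of running_output could be
// created through the NewBuffer factory; mod is "memory" or "redis".
// The ":metrics" / ":failed" key suffixes are illustrative only.
func buildBuffers(mod string, batchSize, bufferLimit int, key, addr string) (metrics, failMetrics buffer.Buffer) {
    metrics = buffer.NewBuffer(mod, batchSize, key+":metrics", addr)
    failMetrics = buffer.NewBuffer(mod, bufferLimit, key+":failed", addr)
    return metrics, failMetrics
}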

Key points of the Redis buffer implementation

To satisfy the FIFO requirement, Redis's list data structure is used. A Redis list is a list of strings, so Telegraf's metric data must be made serializable. For example, struct fields would need to be exported (public), which would mean changing Telegraf's definition of the metric struct; alternatively, a serialization format such as JSON or MessagePack can be used. We chose JSON serialization.
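The article does not include the Redis implementation itself, so the following is a minimal sketch of what it could look like. It assumes the go-redis client (v6-style API without context arguments) and the metric.New constructor from the Telegraf version of that era; the intermediate metricJSON struct and all names are our own illustration, and trimming the list against the size limit is omitted.

package redis

import (
    "encoding/json"
    "log"
    "time"

    goredis "github.com/go-redis/redis" // assumed client library (v6-style API)
    "github.com/influxdata/telegraf"
    "github.com/influxdata/telegraf/metric"
)

// metricJSON is an exported intermediate form so that telegraf.Metric
// (an interface) can be serialized to JSON and back.
type metricJSON struct {
    Name   string                 `json:"name"`
    Tags   map[string]string      `json:"tags"`
    Fields map[string]interface{} `json:"fields"`
    Time   time.Time              `json:"time"`
}

// Buffer is a sketch of a Redis-list-backed FIFO buffer for metrics.
type Buffer struct {
    client *goredis.Client
    key    string
    size   int // kept for parity with the memory buffer; trimming is omitted here
}

func NewBuffer(size int, key, addr string) *Buffer {
    return &Buffer{
        client: goredis.NewClient(&goredis.Options{Addr: addr}),
        key:    key,
        size:   size,
    }
}

func (b *Buffer) IsEmpty() bool { return b.Len() == 0 }

func (b *Buffer) Len() int {
    n, err := b.client.LLen(b.key).Result()
    if err != nil {
        log.Printf("E! redis buffer LLEN failed: %v", err)
        return 0
    }
    return int(n)
}

// Add serializes each metric to JSON and appends it to the tail of the
// list (RPUSH), so that Batch can pop from the head in FIFO order.
func (b *Buffer) Add(metrics ...telegraf.Metric) {
    for _, m := range metrics {
        buf, err := json.Marshal(metricJSON{
            Name:   m.Name(),
            Tags:   m.Tags(),
            Fields: m.Fields(),
            Time:   m.Time(),
        })
        if err != nil {
            log.Printf("E! redis buffer: failed to serialize metric: %v", err)
            continue
        }
        if err := b.client.RPush(b.key, buf).Err(); err != nil {
            log.Printf("E! redis buffer RPUSH failed: %v", err)
        }
    }
}

// Batch pops up to batchSize metrics from the head of the list (LPOP)
// and deserializes them back into telegraf.Metric values.
func (b *Buffer) Batch(batchSize int) []telegraf.Metric {
    out := make([]telegraf.Metric, 0, batchSize)
    for i := 0; i < batchSize; i++ {
        raw, err := b.client.LPop(b.key).Result()
        if err != nil { // goredis.Nil is returned when the list is empty
            break
        }
        var mj metricJSON
        if err := json.Unmarshal([]byte(raw), &mj); err != nil {
            continue
        }
        m, err := metric.New(mj.Name, mj.Tags, mj.Fields, mj.Time)
        if err != nil {
            continue
        }
        out = append(out, m)
    }
    return out
}

One trade-off of plain JSON worth noting: numeric field values come back as float64 after unmarshalling into interface{}, so some type information is lost compared with a schema-aware format.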

Conclusion

After the transformation, you can choose through the configuration file whether the buffer is backed by channels (in memory) or by Redis, according to your needs. Each has its merits: the in-memory implementation is fast and has few external dependencies; Redis, as external storage, provides better data safety, but at some cost in performance, since it involves a large amount of serialization, deserialization, and network transmission. It also adds a dependency on the reliability of Redis itself, so a Redis cluster deployment is recommended.
