K8s auditing: adding a ClickHouse sink to Heapster


Objective

When it comes to Kubernetes resource auditing and billing, containers differ greatly from virtual machines: compared with a VM, a container is much harder to meter.
Resource metrics can be collected with either Heapster or Prometheus. As an earlier article explained, Prometheus has two problems: its storage becomes a bottleneck, and queries over large data volumes easily OOM. So I chose Heapster. In addition, Heapster implements many aggregators and calculators internally and does a lot of work at the aggregation layer, whereas with Prometheus you have to do the aggregation yourself at query time.
Heapster supports many metrics outputs, called sinks. The currently supported sinks (visible in the Build function below) include Elasticsearch, GCM, Stackdriver, Statsd, Graphite, Hawkular, InfluxDB, Kafka, Librato, OpenTSDB, Wavefront, Riemann, and Honeycomb, plus the built-in log and metric sinks.

My preference is the ClickHouse database; previous articles have already covered ClickHouse in detail.
So this article mainly talks about how to add a ClickHouse sink to Heapster.

Code Analysis and Implementation

Looking at the code, adding a sink is quite simple. It is a typical factory design pattern: implement the Name, Stop, and ExportData interface methods, then provide an initialization function for the factory to invoke.
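For reference, the interface a sink must satisfy lives in Heapster's core package and looks roughly like this (paraphrased from memory, so treat it as a sketch rather than the exact source):

// from heapster's metrics/core package (paraphrased)
type DataSink interface {
    // Name identifies the sink in logs.
    Name() string
    // ExportData pushes one batch of scraped metrics to the backing store.
    ExportData(batch *DataBatch)
    // Stop is called on shutdown to release unmanaged resources.
    Stop()
}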

Initialization method: NewClickhouseSink

Specific code:

config, err := clickhouse_common.BuildConfig(uri)
if err != nil {
    return nil, err
}

client, err := sql.Open("clickhouse", config.DSN)
if err != nil {
    glog.Errorf("connecting to clickhouse: %v", err)
    return nil, err
}

sink := &clickhouseSink{
    c:       *config,
    client:  client,
    conChan: make(chan struct{}, config.Concurrency),
}
glog.Infof("created clickhouse sink with options: host:%s user:%s db:%s", config.Host, config.UserName, config.Database)
return sink, nil

Basically, it builds the configuration from the sink URI and initializes the ClickHouse client.
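The clickhouseSink and point types are not shown in the post. Judging from the fields the code touches (c, client, conChan above, plus the Lock/Unlock, wg, and point literals in ExportData below), they look roughly like this; the embedded mutex, the wait group, and the config type name are inferences, not the fork's literal source:

type clickhouseSink struct {
    sync.Mutex                                     // serializes ExportData calls
    wg      sync.WaitGroup                         // tracks in-flight send goroutines
    c       clickhouse_common.ClickhouseConfig     // parsed sink options (type name assumed)
    client  *sql.DB                                // database/sql handle opened with the clickhouse driver
    conChan chan struct{}                          // semaphore capping concurrent sends at config.Concurrency
}

// one metric sample, as assembled in ExportData below
type point struct {
    name    string    // metric name
    cluster string    // cluster label from the sink config
    val     float64   // metric value normalized to float64
    ts      time.Time // batch timestamp
    tags    []string  // "key=value" label pairs
}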

In the Build method in factory.go, register the initialization function you just implemented:

func (this *SinkFactory) Build(uri flags.Uri) (core.DataSink, error) {
    switch uri.Key {
    case "elasticsearch":
        return elasticsearch.NewElasticSearchSink(&uri.Val)
    case "gcm":
        return gcm.CreateGCMSink(&uri.Val)
    case "stackdriver":
        return stackdriver.CreateStackdriverSink(&uri.Val)
    case "statsd":
        return statsd.NewStatsdSink(&uri.Val)
    case "graphite":
        return graphite.NewGraphiteSink(&uri.Val)
    case "hawkular":
        return hawkular.NewHawkularSink(&uri.Val)
    case "influxdb":
        return influxdb.CreateInfluxdbSink(&uri.Val)
    case "kafka":
        return kafka.NewKafkaSink(&uri.Val)
    case "librato":
        return librato.CreateLibratoSink(&uri.Val)
    case "log":
        return logsink.NewLogSink(), nil
    case "metric":
        return metricsink.NewMetricSink(140*time.Second, 15*time.Minute, []string{
            core.MetricCpuUsageRate.MetricDescriptor.Name,
            core.MetricMemoryUsage.MetricDescriptor.Name}), nil
    case "opentsdb":
        return opentsdb.CreateOpenTSDBSink(&uri.Val)
    case "wavefront":
        return wavefront.NewWavefrontSink(&uri.Val)
    case "riemann":
        return riemann.CreateRiemannSink(&uri.Val)
    case "honeycomb":
        return honeycomb.NewHoneycombSink(&uri.Val)
    case "clickhouse":
        return clickhouse.NewClickhouseSink(&uri.Val)
    default:
        return nil, fmt.Errorf("Sink not recognized: %s", uri.Key)
    }
}
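Once registered, the new sink is selected by the scheme of Heapster's --sink flag. A hypothetical invocation might look like the following; the exact query parameters depend on what BuildConfig accepts, so treat the DSN part as illustrative only:

heapster --source=kubernetes:https://kubernetes.default --sink=clickhouse:tcp://clickhouse.default:9000?database=heapster&batchsize=1000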

Name and Stop

func (sink *clickhouseSink) Name() string {
    return "clickhouse"
}

func (sink *clickhouseSink) Stop() {
    // Do nothing
}

The Stop function is called when Heapster shuts down; it is the place to release unmanaged resources. This sink has none to clean up, so it does nothing.

ExportData

This is the core of the implementation.

func (sink *clickhouseSink) ExportData(dataBatch *core.DataBatch) {
    sink.Lock()
    defer sink.Unlock()

    if err := sink.client.Ping(); err != nil {
        glog.Warningf("Failed to ping clickhouse: %v", err)
        return
    }

    dataPoints := make([]point, 0, 0)
    for _, metricSet := range dataBatch.MetricSets {
        for metricName, metricValue := range metricSet.MetricValues {
            // normalize int64/float32 values to float64; skip other types
            var value float64
            if core.ValueInt64 == metricValue.ValueType {
                value = float64(metricValue.IntValue)
            } else if core.ValueFloat == metricValue.ValueType {
                value = float64(metricValue.FloatValue)
            } else {
                continue
            }

            pt := point{
                name:    metricName,
                cluster: sink.c.ClusterName,
                val:     value,
                ts:      dataBatch.Timestamp,
            }
            for key, value := range metricSet.Labels {
                if _, exists := clickhouseBlacklistLabels[key]; !exists {
                    if value != "" {
                        if key == "labels" {
                            lbs := strings.Split(value, ",")
                            for _, lb := range lbs {
                                ts := strings.Split(lb, ":")
                                if len(ts) == 2 && ts[0] != "" && ts[1] != "" {
                                    pt.tags = append(pt.tags, fmt.Sprintf("%s=%s", ts[0], ts[1]))
                                }
                            }
                        } else {
                            pt.tags = append(pt.tags, fmt.Sprintf("%s=%s", key, value))
                        }
                    }
                }
            }

            dataPoints = append(dataPoints, pt)
            if len(dataPoints) >= sink.c.BatchSize {
                sink.concurrentSendData(dataPoints)
                dataPoints = make([]point, 0, 0)
            }
        }
    }

    // flush any remaining points
    if len(dataPoints) > 0 {
        sink.concurrentSendData(dataPoints)
    }

    sink.wg.Wait()
}

There are several places to be aware of:

    • Data format conversion. You need to convert Heapster's DataBatch into the format you want to store. Since the pipeline fans the same batch out to every configured sink, each sink doing its own conversion is easy to understand.
    • Bulk writes. In general, writing in bulk is an effective technique for large data volumes.
    • Concurrent writes to the destination store, bounded by a configured parameter. Go's goroutines are used for this; the following code starts a goroutine to send the data (a sketch of the sendData it calls appears after it).
func (sink *clickhouseSink) concurrentSendData(dataPoints []point) {
    sink.wg.Add(1)
    // use the channel to block until there's less than the maximum number of concurrent requests running
    sink.conChan <- struct{}{}
    go func(dataPoints []point) {
        sink.sendData(dataPoints)
    }(dataPoints)
}
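sendData itself is not shown in the post. Here is a minimal sketch of what it has to do, assuming a hypothetical metrics table with columns (name, cluster, val, ts, tags) and the batch-insert-inside-a-transaction pattern the clickhouse-go driver expects; note it must also release the semaphore slot and decrement the wait group:

// a sketch, not the fork's actual code
// (assumes imports: strings, github.com/golang/glog)
func (sink *clickhouseSink) sendData(dataPoints []point) {
    defer func() {
        // release the concurrency slot and mark this send finished
        <-sink.conChan
        sink.wg.Done()
    }()
    if len(dataPoints) == 0 {
        return
    }

    // the clickhouse driver turns Begin/Prepare/Exec.../Commit into one bulk insert
    tx, err := sink.client.Begin()
    if err != nil {
        glog.Errorf("clickhouse begin: %v", err)
        return
    }
    stmt, err := tx.Prepare("INSERT INTO metrics (name, cluster, val, ts, tags) VALUES (?, ?, ?, ?, ?)")
    if err != nil {
        glog.Errorf("clickhouse prepare: %v", err)
        tx.Rollback()
        return
    }
    defer stmt.Close()

    for _, pt := range dataPoints {
        // tags stored as one comma-joined string here; an Array(String) column would also work
        if _, err := stmt.Exec(pt.name, pt.cluster, pt.val, pt.ts, strings.Join(pt.tags, ",")); err != nil {
            glog.Errorf("clickhouse exec: %v", err)
            tx.Rollback()
            return
        }
    }
    if err := tx.Commit(); err != nil {
        glog.Errorf("clickhouse commit: %v", err)
    }
}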

Get configuration parameters

clickhouse.go mainly parses the configuration parameters, fills in default values for any parameters left unset, and validates the configuration.
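The post does not include clickhouse.go itself, so here is a minimal sketch of what BuildConfig could look like. The field names (DSN, Host, UserName, Database, ClusterName, BatchSize, Concurrency) come from the code shown earlier; the query-parameter names and default values are my assumptions:

// (assumes imports: fmt, net/url, strconv)
const (
    defaultBatchSize   = 1000 // assumed default
    defaultConcurrency = 1    // assumed default
)

type ClickhouseConfig struct {
    DSN         string
    Host        string
    UserName    string
    Database    string
    ClusterName string
    BatchSize   int
    Concurrency int
}

// BuildConfig parses the sink URL, applies defaults, and validates the result.
func BuildConfig(uri *url.URL) (*ClickhouseConfig, error) {
    config := ClickhouseConfig{
        Host:        uri.Host,
        UserName:    "default",
        Database:    "heapster",
        BatchSize:   defaultBatchSize,
        Concurrency: defaultConcurrency,
    }
    if config.Host == "" {
        return nil, fmt.Errorf("clickhouse host missing from sink uri")
    }

    opts := uri.Query()
    if len(opts["user"]) > 0 {
        config.UserName = opts["user"][0]
    }
    if len(opts["database"]) > 0 {
        config.Database = opts["database"][0]
    }
    if len(opts["cluster"]) > 0 {
        config.ClusterName = opts["cluster"][0]
    }
    if len(opts["batchsize"]) > 0 {
        n, err := strconv.Atoi(opts["batchsize"][0])
        if err != nil || n <= 0 {
            return nil, fmt.Errorf("invalid batchsize %q", opts["batchsize"][0])
        }
        config.BatchSize = n
    }
    if len(opts["concurrency"]) > 0 {
        n, err := strconv.Atoi(opts["concurrency"][0])
        if err != nil || n <= 0 {
            return nil, fmt.Errorf("invalid concurrency %q", opts["concurrency"][0])
        }
        config.Concurrency = n
    }

    // build the DSN consumed by sql.Open("clickhouse", ...)
    config.DSN = fmt.Sprintf("tcp://%s?username=%s&database=%s",
        config.Host, config.UserName, config.Database)
    return &config, nil
}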

Changes to Dockerfile

The original base image was scratch:

FROM scratch
COPY heapster eventer /
COPY ca-certificates.crt /etc/ssl/certs/
#   nobody:nobody
USER 65534:65534
ENTRYPOINT ["/heapster"]

Because I needed to set the timezone, I switched to an Alpine base:

FROM alpine
RUN apk add -U tzdata
RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
COPY heapster eventer /
COPY ca-certificates.crt /etc/ssl/certs/
RUN chmod +x /heapster
ENTRYPOINT ["/heapster"]

In fact, it is also possible to stay on scratch and add the timezone data by copying in a few extra files, but the image grows either way; rather than do that, I preferred to base it on Alpine, which I am more familiar with.
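For completeness, here is one way the scratch base could keep timezone data via a multi-stage build; this is only a sketch of the trade-off mentioned above, not what the fork ships:

FROM alpine AS tz
RUN apk add -U tzdata

FROM scratch
COPY --from=tz /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
COPY heapster eventer /
COPY ca-certificates.crt /etc/ssl/certs/
USER 65534:65534
ENTRYPOINT ["/heapster"]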

Summary

The code is available in my fork of the project. Judging from the actual run log, and thanks to ClickHouse's excellent write performance, the sink runs very stably.
