Kubernetes Log Collection Based on Customized Filebeat Development

Source: Internet
Author: User

The mainstream container orchestration tools today are Kubernetes, Mesos, and Swarm. I will not judge which is better, because each has its own strengths. Personally, though, I think Kubernetes currently gets the most attention, and more and more companies use it as the underlying orchestration tool for their own container scheduling platforms. A PaaS platform should provide compute, monitoring, and other services as a whole, and because most of what runs in containers on Kubernetes is stateless services, unified log management is an essential part of it. Let's talk about how to build your own log collection on top of Filebeat.

The most widely used log management stack today is probably ELK. The E (Elasticsearch) needs little discussion: many companies use it as their storage and indexing engine. The L (Logstash) is a log collection tool that supports file collection and other inputs, but collecting container logs differs slightly from traditional file collection, and although Docker itself provides some log drivers, they still do not meet our needs well. Kubernetes now has an official logging solution based on Fluentd. As for why I finally chose Filebeat rather than Fluentd, there are a few reasons:

    • First, Filebeat is written in Go and I am a Go developer. Fluentd is written in Ruby, which, sorry to say, I cannot read.
    • Filebeat is relatively lightweight. Its feature set is fairly simple for now but is basically enough, and the built image is only a few dozen MB.
    • Filebeat performs well. I have not compared it directly with Fluentd, but in an earlier comparison it was clearly better than Logstash. Logstash is also written in Ruby, so I expect Filebeat to do better than Fluentd as well.
    • Filebeat's features are simple, but its code structure makes custom development very easy.
    • And even after using Fluentd for a long time, I still find Fluentd's configuration file really hard to understand.

How Filebeat collects Kubernetes logs

For the reasons above, I decided to develop our own log collection based on Filebeat.
Filebeat's code is on GitHub at https://github.com/elastic/beats, a repository that contains several projects, Filebeat among them.

Like other log collection tools, Filebeat is made up of a few parts: inputs, processors, and outputs. It offers fewer capabilities than the others, but that does not matter; what it has is enough.

Filebeat provides an add_kubernetes_metadata processor, which is used together with the collection path /var/lib/docker/containers/*/*-json.log. It mainly watches the Kubernetes apiserver and keeps the pod information for each container in memory. It then takes the container ID from the log's source field (that is, the path above) and matches it to the pod's information.
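The matching step described above can be sketched in a few lines of Go. This is only an illustration of the idea, not the processor's actual code: podInfo, containerIDFromPath, and lookupPod are hypothetical names, and the cache is a plain map standing in for the data watched from the apiserver.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// podInfo is a hypothetical, trimmed-down stand-in for cached pod metadata.
type podInfo struct {
	Namespace string
	Name      string
}

// containerIDFromPath extracts the container ID from a Docker json-file log
// path of the form /var/lib/docker/containers/<id>/<id>-json.log.
func containerIDFromPath(source string) (string, error) {
	base := path.Base(source)
	if !strings.HasSuffix(base, "-json.log") {
		return "", fmt.Errorf("%q is not a *-json.log path", source)
	}
	id := strings.TrimSuffix(base, "-json.log")
	// The parent directory carries the same ID; use it as a sanity check.
	if id == "" || path.Base(path.Dir(source)) != id {
		return "", fmt.Errorf("%q has no consistent container ID", source)
	}
	return id, nil
}

// lookupPod matches a log source path against the in-memory pod cache.
func lookupPod(cache map[string]podInfo, source string) (podInfo, bool) {
	id, err := containerIDFromPath(source)
	if err != nil {
		return podInfo{}, false
	}
	pod, found := cache[id]
	return pod, found
}

func main() {
	// Assumed cache content; in practice it would be filled from the apiserver.
	cache := map[string]podInfo{
		"abc123": {Namespace: "default", Name: "web-0"},
	}
	pod, found := lookupPod(cache, "/var/lib/docker/containers/abc123/abc123-json.log")
	fmt.Println(pod.Namespace, pod.Name, found)
}
```

Logs from any other path simply find no pod, so their events would pass through without Kubernetes metadata.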
Because each log entry in a *-json.log file is in JSON format, the log has to be JSON-decoded, and Filebeat has a processor for that called decode_json_fields. All of these processors support conditions, so a condition can decide whether a given log is processed. Filebeat's default log field is message, but after a *-json.log entry is parsed the log field is log. If other logs are collected at the same time, the field that stores the log text is different, so they have to be processed to use the same field. Filebeat did not provide this capability, so I wrote an add_fields processor myself.
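To illustrate why the two kinds of input need to be brought onto one field, here is a minimal Go sketch of the same three steps on a plain map: copy message into log, decode the text if it is a JSON object, then drop message. The normalize function and the sample lines are assumptions made for illustration; Filebeat's processors work on its own event type, not on a bare map.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// normalize leaves the log text under the "log" key for both plain-file
// events and Docker *-json.log events.
func normalize(event map[string]interface{}) error {
	msg, ok := event["message"].(string)
	if !ok {
		return fmt.Errorf("event has no string \"message\" field")
	}

	// Step 1: copy message into log, so plain-file events already match.
	event["log"] = msg

	// Step 2: if the text is a JSON object, merge its keys into the event
	// root, overwriting existing keys (this replaces "log" for Docker lines).
	var decoded map[string]interface{}
	if err := json.Unmarshal([]byte(msg), &decoded); err == nil {
		for key, value := range decoded {
			event[key] = value
		}
	}

	// Step 3: the original message field is no longer needed.
	delete(event, "message")
	return nil
}

func main() {
	dockerLine := map[string]interface{}{
		"message": `{"log":"hello from a container\n","stream":"stdout","time":"2018-01-01T00:00:00Z"}`,
	}
	plainLine := map[string]interface{}{"message": "hello from a plain file"}

	for _, event := range []map[string]interface{}{dockerLine, plainLine} {
		if err := normalize(event); err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q\n", event["log"])
	}
}
```

After this step, anything downstream only has to look at the log field, whichever input the event came from.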

The resulting configuration file is as follows:

    filebeat.prospectors:
    - type: log
      paths:
        - /var/lib/docker/containers/*/*-json.log
        - /var/log/containers/applogs/*
    processors:
    - add_kubernetes_metadata:
        in_cluster: false
        host: "127.0.0.1"
        kube_config: /root/.kube/config
    - add_fields:
        fields:
          log: '{message}'
    - decode_json_fields:
        when:
          regexp:
            log: '{*}'
        fields: ['log']
        overwrite_keys: true
        target: ""
    - drop_fields:
        fields: ["source", "beat.version", "beat.name", "message"]
    - parse_level:
        levels: ["fatal", "error", "warn", "info", "debug"]
        field: "log"
    logging.level: info
    setup.template.enabled: true
    setup.template.name: "filebeat-%{+yyyy.MM.dd}"
    setup.template.pattern: "filebeat-*"
    #setup.template.fields: "${path.config}/fields.yml"
    setup.template.fields: "/fields.yml"
    setup.template.overwrite: true
    setup.template.settings:
      index:
        analysis:
          analyzer:
            enncloud_analyzer:
              filter: ["standard", "lowercase", "stop"]
              char_filter: ["my_filter"]
              type: custom
              tokenizer: standard
          char_filter:
            my_filter:
              type: mapping
              mappings: ["-=>_"]
    output:
      elasticsearch:
        hosts: ["127.0.0.1:9200"]
        index: "filebeat-%{+yyyy.MM.dd}"

In a production environment, Filebeat itself usually runs as a DaemonSet inside the Kubernetes cluster. In that case in_cluster must be set to true and kube_config does not need to be configured. The host parameter limits watching to the pods of one node, so its value should be the name of the node on which the Filebeat pod is running. It can also be left out, in which case pods across the whole cluster are watched, but that is unnecessary and not good for Filebeat.

The add_fields processor adds the fields you want. A value can be either a plain string or a placeholder in the {message} format; in the latter case the new field is populated with the value of the existing field named inside the braces.

The parse_level processor matches against the log format: if a logging level appears at the start of the log, a corresponding level field is added to the event.
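The article does not show how parse_level is implemented, so the following Go sketch is only a guess at the behavior described above. The parseLevel and addLevel functions, the case-insensitive comparison, the skipping of a leading bracket, and the name of the output field ("level") are all assumptions.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// parseLevel reports which configured level, if any, the log text starts with.
// Leading spaces, tabs, and opening brackets are skipped, the comparison is
// case-insensitive, and the level must not run into a longer word.
func parseLevel(text string, levels []string) (string, bool) {
	head := strings.ToLower(strings.TrimLeft(text, " \t["))
	for _, level := range levels {
		l := strings.ToLower(level)
		if !strings.HasPrefix(head, l) {
			continue
		}
		rest := head[len(l):]
		if rest == "" || !unicode.IsLetter(rune(rest[0])) {
			return l, true
		}
	}
	return "", false
}

// addLevel reads the text from the configured field and, when a level is
// recognized, stores it in event["level"]. Other events are left unchanged.
func addLevel(event map[string]interface{}, field string, levels []string) error {
	text, ok := event[field].(string)
	if !ok {
		return fmt.Errorf("field %q is missing or not a string", field)
	}
	if level, found := parseLevel(text, levels); found {
		event["level"] = level
	}
	return nil
}

func main() {
	levels := []string{"fatal", "error", "warn", "info", "debug"}
	event := map[string]interface{}{"log": "ERROR failed to connect to database"}
	if err := addLevel(event, "log", levels); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(event["level"])
}
```

The word-boundary check keeps a line that begins with "information" from being tagged as info; whether the real processor is that strict is not something the article states.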

Filebeat can also manage the index template, which is where you specify the mapping to use.

Developing a Filebeat processor

In practice, the work was mainly developing processors for needs that the existing ones did not meet. Filebeat's code structure is clear and well abstracted, so this kind of development is simple. The processor functionality is mainly placed in libbeat, a sibling directory of filebeat, in a directory called processors. In it you can see actions, add_cloud_metadata, add_kubernetes_metadata, and add_docker_metadata, so Filebeat also supports Docker directly. The more general-purpose processors are placed under actions, so a simple processor of our own can go directly there as well; decode_json_fields and drop_event live there too. Take add_fields as an example:

    package actions

    import (
        "fmt"
        "regexp"
        "strings"

        "github.com/elastic/beats/libbeat/beat"
        "github.com/elastic/beats/libbeat/common"
        "github.com/elastic/beats/libbeat/processors"
    )

    // addFields holds the configured field/value pairs and the regexp used
    // to recognize {field} placeholders in the values.
    type addFields struct {
        fields map[string]string
        reg    *regexp.Regexp
    }

    // init registers the processor under the name used in the config file.
    func init() {
        processors.RegisterPlugin("add_fields",
            configChecked(newAddFields,
                requireFields("fields"),
                allowedFields("fields", "when")))
    }

    // newAddFields builds the processor from its configuration section.
    func newAddFields(c *common.Config) (processors.Processor, error) {
        config := struct {
            Fields map[string]string `config:"fields"`
        }{}
        err := c.Unpack(&config)
        if err != nil {
            return nil, fmt.Errorf("fail to unpack the add_fields configuration: %s", err)
        }

        f := &addFields{fields: config.Fields, reg: regexp.MustCompile("{(.*)}")}
        return f, nil
    }

    // Run is called for every event. A plain value is stored as-is; a
    // {name} placeholder is replaced by the value of the existing field name.
    func (f *addFields) Run(event *beat.Event) (*beat.Event, error) {
        var errors []string
        for field, value := range f.fields {
            matchers := f.reg.FindAllStringSubmatch(value, -1)
            if len(matchers) == 0 {
                event.PutValue(field, value)
            } else {
                if len(matchers[0]) >= 2 {
                    val, err := event.GetValue(strings.Trim(matchers[0][1], " "))
                    if err != nil {
                        // Lookup errors are collected here but not returned.
                        errors = append(errors, err.Error())
                    } else {
                        event.PutValue(field, val)
                    }
                }
            }
        }
        return event, nil
    }

    func (f *addFields) String() string {
        var fields []string
        for field := range f.fields {
            fields = append(fields, field)
        }
        return "add_fields=" + strings.Join(fields, ",")
    }

You define your own struct, and the newAddFields function initializes it from the configuration file. The processor is registered in init through RegisterPlugin. The struct mainly has to implement the Run method, which does the actual processing for each log event.
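The three pieces just described (a struct, a constructor that reads the configuration, and registration by name) can be tried out without the Beats source tree. The Go sketch below imitates the pattern with hypothetical stand-ins: Event, Processor, Constructor, register, and the upper_case processor are all invented for illustration and are not libbeat APIs.

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a hypothetical stand-in for the Beats event type: a flat field map.
type Event map[string]interface{}

// Processor has the two methods a processor struct needs to provide.
type Processor interface {
	Run(event Event) (Event, error)
	String() string
}

// Constructor builds a processor from its configuration section.
type Constructor func(config map[string]string) (Processor, error)

// registry maps a processor name, as written in the config file, to its constructor.
var registry = map[string]Constructor{}

// register is a simplified stand-in for a plugin registration call.
func register(name string, c Constructor) {
	registry[name] = c
}

// upperCase is a toy processor that upper-cases one string field.
type upperCase struct {
	field string
}

func newUpperCase(config map[string]string) (Processor, error) {
	field, ok := config["field"]
	if !ok {
		return nil, fmt.Errorf("missing required option: field")
	}
	return &upperCase{field: field}, nil
}

// Run does the per-event work.
func (u *upperCase) Run(event Event) (Event, error) {
	s, ok := event[u.field].(string)
	if !ok {
		return event, fmt.Errorf("field %q is missing or not a string", u.field)
	}
	event[u.field] = strings.ToUpper(s)
	return event, nil
}

func (u *upperCase) String() string {
	return "upper_case=" + u.field
}

// init registers the processor before main starts, as a plugin would.
func init() {
	register("upper_case", newUpperCase)
}

func main() {
	p, err := registry["upper_case"](map[string]string{"field": "level"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	event, err := p.Run(Event{"level": "warn", "log": "disk almost full"})
	fmt.Println(p, event["level"], err)
}
```

The point of the pattern is that the pipeline only ever sees the Processor interface, so adding a new processor means adding one file and one registration call.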

With this, the integration with Kubernetes is basically complete. Of course there is a lot more that can be done. For example, Go's own regexp and encoding/json packages perform relatively poorly, and these are places that can be optimized.
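As one possible direction for the regexp part, the {field} placeholder lookup can be done with two byte scans instead of a regular expression. The Go sketch below is an assumption about how that might look, not code from the fork: fieldRegexp reproduces the "{(.*)}" lookup, fieldScan is the hypothetical replacement, and checkEquivalent confirms they agree on the given inputs. The two are only equivalent for single-line values, and any speed-up should be confirmed with a benchmark before relying on it.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var placeholder = regexp.MustCompile("{(.*)}")

// fieldRegexp extracts the placeholder name with the regexp "{(.*)}".
func fieldRegexp(value string) (string, bool) {
	m := placeholder.FindStringSubmatch(value)
	if len(m) < 2 {
		return "", false
	}
	return strings.Trim(m[1], " "), true
}

// fieldScan finds the first '{' and the last '}' with plain byte scans,
// which mirrors the leftmost, greedy match of the regexp.
func fieldScan(value string) (string, bool) {
	start := strings.IndexByte(value, '{')
	if start < 0 {
		return "", false
	}
	end := strings.LastIndexByte(value, '}')
	if end < start {
		return "", false
	}
	return strings.Trim(value[start+1:end], " "), true
}

// checkEquivalent returns an error for the first input on which the two
// implementations disagree.
func checkEquivalent(values []string) error {
	for _, v := range values {
		wantName, wantOK := fieldRegexp(v)
		gotName, gotOK := fieldScan(v)
		if wantName != gotName || wantOK != gotOK {
			return fmt.Errorf("mismatch for %q: regexp=(%q,%v) scan=(%q,%v)",
				v, wantName, wantOK, gotName, gotOK)
		}
	}
	return nil
}

func main() {
	values := []string{"{message}", "{ message }", "static text", "}{", "{}", "a{b}c{d}"}
	if err := checkEquivalent(values); err != nil {
		fmt.Println(err)
		return
	}
	name, ok := fieldScan("{message}")
	fmt.Println(name, ok)
}
```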

My own fork is at https://github.com/yiqinguo/beats. It adds a Makefile that compiles the code and builds the image directly, and a filebeat-ds.yml that can be deployed directly into a Kubernetes cluster.
