Log Analysis: Logstash Plugin Introduction

Source: Internet
Author: User
Tags: ffi, geoip, apache, log, logstash

Logstash is a lightweight log collection and processing framework. It lets you easily collect scattered, diverse logs, process them with custom rules, and then ship them to a specific destination such as a server or a file.

Logstash is very powerful. Starting with the 1.5.0 release, Logstash split all of its plugins into independent gem packages. This way each plugin can be updated on its own, without waiting for Logstash itself to make an overall release. To support this, Logstash ships with a dedicated plugin management command.
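For orientation, these are the main subcommands of that management tool, shown in the Logstash 2.x layout this article assumes (run from the Logstash install directory):

bin/logstash-plugin list                            # show installed plugins
bin/logstash-plugin install logstash-filter-kv      # install a plugin
bin/logstash-plugin update logstash-filter-kv       # update a single plugin
bin/logstash-plugin uninstall logstash-filter-kv    # remove a plugin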


Logstash plugin installation (local installation)

Logstash handles events in three stages: input ---> filter ---> output. Inputs generate events, filters modify them, and outputs ship them somewhere else.
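To make the three stages concrete, here is a minimal pipeline sketch; the file input path and the stdout output are illustrative assumptions, not part of the original article:

input {
  file {
    path => "/var/log/apache2/access.log"    # assumed example log path
    type => "apache"
  }
}
filter {
  # the filter plugins covered below (grok, geoip, date, useragent, mutate, drop) go here
}
output {
  stdout { codec => rubydebug }    # print processed events for debugging
}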

Filters are the intermediate processing devices of the Logstash pipeline. You can combine them with conditional statements so that only events matching certain criteria are processed. This article covers filter plugins only.

You can see which plugins are currently available with bin/logstash-plugin list (the plugin gems actually live under the vendor/bundle/jruby/1.9/gems/ directory).

Installing a plugin locally

bin/logstash-plugin install logstash-filter-kv
Ignoring ffi-1.9.13 because its extensions are not built. Try: gem pristine ffi --version 1.9.13
Validating logstash-filter-kv
Installing logstash-filter-kv
Installation successful

Updating a plugin: bin/logstash-plugin update logstash-filter-kv
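Two related usages are worth knowing; both are standard behavior of the same tool, though the gem file name below is a hypothetical example:

bin/logstash-plugin update                                        # update all installed plugins at once
bin/logstash-plugin install /tmp/logstash-filter-kv-2.0.5.gem     # install from a local gem file (useful offline)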


Plugin Introduction

Here are just a few common ones:

Grok: parse and structure arbitrary text (already covered in the post below, so not repeated here)

http://irow10.blog.51cto.com/2425361/1828077


GeoIP: add geographic location information based on an IP address.

The geoip plugin is very important and very commonly used. By analyzing the visiting IP address, it can work out where the visitor is located. For example:

filter {
  geoip {
    source => "message"
  }
}

If the message field contains an IP address, the result looks like this:

{
             "message" => "183.60.92.253",
            "@version" => "1",
          "@timestamp" => "2016-07-07T10:32:55.610Z",
                "host" => "raochenlindeMacBook-Air.local",
               "geoip" => {
                    "ip" => "183.60.92.253",
         "country_code2" => "CN",
         "country_code3" => "CHN",
          "country_name" => "China",
        "continent_code" => "AS",
           "region_name" => "30",
             "city_name" => "Guangzhou",
              "latitude" => 23.11670000000001,
             "longitude" => 113.25,
              "timezone" => "Asia/Chongqing",
      "real_region_name" => "Guangdong",
              "location" => [
            [0] 113.25,
            [1] 23.11670000000001
        ]
    }
}

In practice, we can pass the request_ip extracted by grok to geoip for processing:

filter {
  if [type] == "apache" {
    grok {
      patterns_dir => "/usr/local/logstash-2.3.4/ownpatterns/patterns"
      match => { "message" => "%{APACHE_LOG}" }
      remove_field => ["message"]
    }
    geoip {
      source => "request_ip"
    }
  }
}

When Logstash reaches the output stage and ships the data elsewhere, the events already carry the visitor's geographic information.
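If you only need part of that information, the geoip filter also accepts a fields option to keep the event lean. A sketch, assuming only the country, city, and coordinates are used downstream:

filter {
  geoip {
    source => "request_ip"
    # keep only the geoip fields we actually use
    fields => ["country_name", "city_name", "location"]
  }
}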


Date: parse the time string in your log records.

date is a very commonly used plugin:

filter {
  if [type] == "apache" {
    grok {
      patterns_dir => "/usr/local/logstash-2.3.4/ownpatterns/patterns"
      match => { "message" => "%{APACHE_LOG}" }
      remove_field => ["message"]
    }
    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}

The timestamp in the Apache log looks like [19/Jul/2016:16:28:52 +0800]; the time format given to match must correspond to the format that actually appears in the log.

APACHE_LOG %{IPORHOST:addre} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] \"%{WORD:http_method} %{NOTSPACE:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:status} (?:%{NUMBER:bytes}|-) \"(?:%{URI:http_referer}|-)\" \"%{GREEDYDATA:user_agent}\"

The Apache log already contains \[%{HTTPDATE:timestamp}\], so why run it through the date plugin at all?

We want the visit time to become the Logstash timestamp, so that we can query the requests of a given period by time. If the time string cannot be matched, Logstash uses the current time as the record's timestamp instead, which is why you need to define the timestamp format inside the filter. If date is not used, the time shown in Elasticsearch is likely to differ from the time the log was actually generated.
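A slightly more defensive variant of the date block; the extra settings are standard date-filter options, though wanting them here is an assumption on my part. It tries several formats, pins the month-name locale to English, and tags events whose timestamp could not be parsed:

date {
  match          => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z", "ISO8601" ]
  locale         => "en"                      # month abbreviations like "Jul" are English
  tag_on_failure => ["_dateparsefailure"]     # mark events that fell back to the current time
}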


useragent: parse the browser and operating system used by visitors.

In the Apache log you will find strings like: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36

We capture this with %{GREEDYDATA:user_agent} when splitting the data with grok.

Note: GREEDYDATA is the grok pattern that matches anything; it is defined as GREEDYDATA .*

Displayed as-is, this string does not tell us much, but we can mine useful information out of it with the useragent plugin:

filter {
  if [type] == "apache" {
    grok {
      patterns_dir => "/usr/local/logstash-2.3.4/ownpatterns/patterns"
      match => { "message" => "%{APACHE_LOG}" }
      remove_field => ["message"]
    }
    useragent {
      source => "user_agent"
      target => "ua"
    }
    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}

The result:

(Screenshot: the parsed ua field, showing the visitor's browser and operating system details.)

From it we can see the visitor's browser and operating system, which is far more meaningful than that big raw string.


Mutate: provides rich basic data-processing capabilities, including type conversion, string processing, and field handling.

The types you can convert to include "integer", "float", and "string". For example:

filter {
  mutate {
    convert => ["request_time", "float"]
  }
}
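Beyond convert, here is a sketch of the string-processing and field-handling side of mutate; all three are standard mutate options, but the field names are illustrative assumptions based on the APACHE_LOG pattern above:

filter {
  mutate {
    rename    => { "addre" => "client_ip" }    # rename a field
    gsub      => [ "request", "[?].*", "" ]    # strip the query string from the request
    lowercase => [ "http_method" ]             # normalize case
  }
}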

Simple optimization of data

The data Logstash collects, enriched by plugins such as date, geoip, and useragent, becomes more informative but also more bloated. What we should do is kick out the meaningless pieces and slim down the data we send to Elasticsearch.

remove_field accomplishes this task very well, and it has already been used in the examples above:

remove_field => ["message"]

grok has already split the message into many small fields, so shipping the original message to Elasticsearch as well would duplicate the data.

Of course there are also many small fields that we never use or that carry no meaning; we can clear them with remove_field as well:

mutate {
  remove_field => ["syslog_timestamp"]
  remove_field => ["message"]
}
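Since remove_field accepts an array, the two options above can also be written as one; the two forms are equivalent, this is just more compact:

mutate {
  remove_field => ["syslog_timestamp", "message"]
}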

Reference: https://zengjice.gitbooks.io/logstash-best-practice-cn/content/filter/mutate.html


Drop: completely discard matching events, such as debug events.

filter {
  if [loglevel] == "debug" {
    drop { }
  }
}
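The drop filter also has a percentage option for sampling rather than discarding everything. A sketch, assuming you want to drop only about 40% of debug events:

filter {
  if [loglevel] == "debug" {
    drop {
      percentage => 40    # drop roughly 40% of matching events, keep the rest
    }
  }
}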

Reference: https://www.elastic.co/guide/en/logstash/current/plugins-filters-drop.html




This article is from the "Tranquility Zhiyuan" blog; please be sure to keep this source: http://irow10.blog.51cto.com/2425361/1828521
