Logstash is a lightweight log collection and processing framework that lets you easily gather scattered, diverse logs, process them with custom logic, and then ship them to a specific destination, such as a server or a file.
Logstash is very powerful. Starting with the 1.5.0 release, Logstash split all of its plugins into independent gem packages. This way, each plugin can be updated on its own, without waiting for Logstash itself to do an overall release. To support this, Logstash ships a dedicated plugin management command.
Logstash plugin installation (local installation)
Logstash handles events in three stages: input ---> filter ---> output. The input stage generates an event, filter modifies it, and output ships it elsewhere.
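As a minimal sketch of these three stages (the log file path and the Elasticsearch address here are illustrative assumptions, not from the original setup):

input {
  file {
    path => "/var/log/httpd/access_log"    # hypothetical Apache access log path
    type => "apache"
  }
}

filter {
  # event-modifying plugins (grok, geoip, date, ...) go here
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]            # assumed local Elasticsearch
  }
}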
The filter is the intermediate processing device of the Logstash pipeline. You can combine it with conditional statements to process only events that meet certain criteria. Only filter plugins are introduced here:
You can see how many plugins are currently available with bin/plugin list. (They actually live in the vendor/bundle/jruby/1.9/gems/ directory.)
How to install a plugin locally:
bin/logstash-plugin install logstash-filter-kv
Ignoring ffi-1.9.13 because its extensions are not built. Try: gem pristine ffi --version 1.9.13
Validating logstash-filter-kv
Installing logstash-filter-kv
Installation successful
Update a plugin: bin/logstash-plugin update logstash-filter-kv
Plugin Introduction
Here are just a few common ones.
grok : Parses and structures arbitrary text (already described previously, so not repeated here):
http://irow10.blog.51cto.com/2425361/1828077
geoip : Adds geographical information about an IP address.
The geoip plugin is very important and very commonly used. It can work out a visitor's location by analyzing the IP address of the request. An example:
filter {
  geoip {
    source => "message"
  }
}
If message contains an IP address, the result looks like this:
{ "message" => "183.60.92.253", "@version" => "1", "@timestamp" => "2016-07-07t10 : 32:55.610z ", " host " => " Raochenlindemacbook-air.local ", " GeoIP " => { "IP" => "183.60.92.253", "Country_code2" => "CN", "Country_code3" => "CHN", "Country_name" => "China", "Continent_code" => "as", "Region_name" = > ", " City _name " => " Guangzhou ", "Latitude" => 23.11670000000001, "Longitude" => 113.25, "timezone" => "Asia /chongqing ", " Real_region_name " => " Guangdong ", "Location" => [ [0] 113.25, [1] 23.11670000000001 ] }}
In practice, we can pass the request_ip extracted by grok to geoip for processing:
filter {
  if [type] == "apache" {
    grok {
      patterns_dir => "/usr/local/logstash-2.3.4/ownpatterns/patterns"
      match => { "message" => "%{APACHE_LOG}" }
      remove_field => ["message"]
    }
    geoip {
      source => "request_ip"
    }
  }
}
When Logstash reaches the output stage and ships the data elsewhere, each event already carries the visitor's geographical information.
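If you only need part of that information, the geoip filter's fields option can limit what gets added (a sketch; the particular selection of fields here is just an example):

filter {
  geoip {
    source => "request_ip"
    # keep only the geo fields we actually use
    fields => ["country_name", "city_name", "location"]
  }
}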
date : Used to parse the time string in your log records.
date is a very commonly used plugin.
filter {
  if [type] == "apache" {
    grok {
      patterns_dir => "/usr/local/logstash-2.3.4/ownpatterns/patterns"
      match => { "message" => "%{APACHE_LOG}" }
      remove_field => ["message"]
    }
    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}
The timestamp in the Apache log looks like [19/Jul/2016:16:28:52 +0800], so the time format given to match must correspond to the format actually found in the log.
APACHE_LOG %{IPORHOST:addre} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] \"%{WORD:http_method} %{NOTSPACE:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:status} (?:%{NUMBER:bytes}|-) \"(?:%{URI:http_referer}|-)\" \"%{GREEDYDATA:user_agent}\"
The Apache log already contains \[%{HTTPDATE:timestamp}\], so why run it through the date plugin at all?
We want the visit time to become the event's Logstash timestamp; with that in place we can query by time how requests behaved during a given period. If the date filter fails to match the time, Logstash falls back to using the current time as the record's timestamp, which is why you need to define the timestamp format in the filter. Without the date plugin, the time shown in Elasticsearch would likely differ from the time the log was actually generated.
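A sketch of the relevant behavior (the target and failure-tag settings shown are the plugin's documented defaults, spelled out here only for clarity):

date {
  match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  target => "@timestamp"                  # default: overwrite the event timestamp on success
  tag_on_failure => ["_dateparsefailure"] # default: tag events whose time string did not parse
}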
useragent : Used to analyze the browser and operating system of your visitors.
In the Apache log you will find strings like: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36
We also captured this with %{GREEDYDATA:user_agent} when splitting the data with grok.
Note: GREEDYDATA is the grok expression that matches anything; it is defined as: GREEDYDATA .*
Displayed as-is, this string tells us little, but we can mine it with the useragent plugin:
filter {
  if [type] == "apache" {
    grok {
      patterns_dir => "/usr/local/logstash-2.3.4/ownpatterns/patterns"
      match => { "message" => "%{APACHE_LOG}" }
      remove_field => ["message"]
    }
    useragent {
      source => "user_agent"
      target => "ua"
    }
    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}
The result:
(Screenshot: the output event with the parsed ua fields.)
From it we can see the visitor's browser and operating system information, which is far more meaningful than that big raw string.
mutate : Provides rich data-processing capabilities for basic types, including type conversion, string processing, and field handling.
The types you can convert to include "integer", "float", and "string". An example:
filter {
  mutate {
    convert => ["request_time", "float"]
  }
}
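Beyond convert, here is a sketch of the string processing and field handling mutate offers (the field names come from the grok pattern above; the specific edits are only illustrations):

filter {
  mutate {
    rename => { "addre" => "client_ip" }  # field handling: rename a grok-extracted field
    gsub => ["request", "/", "_"]         # string processing: replace all slashes with underscores
    lowercase => ["http_method"]          # string processing: normalize case
  }
}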
Simple optimization of data
Logstash collects data, and plugins such as date, geoip, and useragent enrich the information we get, but they also bloat it. What we should do is discard the meaningless pieces and slim down the data transferred to Elasticsearch.
remove_field accomplishes this very well, and it already appeared in the examples above:
remove_field => ["message"]
With grok we have already split message into many small fields; if message itself is also transmitted to Elasticsearch, the data is duplicated.
Of course there are also many small fields we never use or that carry no meaning; we can clear those with remove_field too:
mutate {
  remove_field => ["syslog_timestamp"]
  remove_field => ["message"]
}
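Since remove_field accepts an array, the two settings above can also be merged into one:

mutate {
  remove_field => ["syslog_timestamp", "message"]
}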
Reference: https://zengjice.gitbooks.io/logstash-best-practice-cn/content/filter/mutate.html
drop : Drops events entirely, for example debug events:
filter {
  if [loglevel] == "debug" {
    drop { }
  }
}
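The plugin also has a percentage option for sampling instead of dropping everything (a sketch; the value 40 is arbitrary):

filter {
  if [loglevel] == "debug" {
    drop {
      percentage => 40   # drop roughly 40% of debug events, keep the rest
    }
  }
}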
Reference: https://www.elastic.co/guide/en/logstash/current/plugins-filters-drop.html
This article is from the "Tranquility Zhiyuan" blog, please be sure to keep this source http://irow10.blog.51cto.com/2425361/1828521