This article is part of the Linux Operations and Maintenance Enterprise Architecture in Practice series.
I. Collecting and cutting the company's custom logs
Many companies' logs do not match the default log formats of the services they run, so we need to cut (parse) them ourselves.
1. Sample logs to be cut
2018-02-24 11:19:23,532 [143] DEBUG Performancetrace 1145 Http://api.114995.com:8082/api/Carpool/QueryMatchRoutes 183.205.134.240 null 972533 310000 TITTL00 Huawei 860485038452951 3.1.146 Huawei 5.1 113.552344 33.332737 Send response complete Exception :(null)
2. Cutting configuration
On Logstash, use the grok filter plugin for the cutting.
input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \[%{NUMBER:thread:int}\] %{DATA:level} (?<logger>[a-zA-Z]+) %{NUMBER:ExecuteTime:int} %{URI:url} %{IP:clientip} %{USERNAME:username} %{NUMBER:userid:int} %{NUMBER:AreaCode:int} (?<Board>[0-9a-zA-Z]+[-]?[0-9a-zA-Z]+) (?<Brand>[0-9a-zA-Z]+[-]?[0-9a-zA-Z]+) %{NUMBER:DeviceId:int} (?<TerminalSourceVersion>[0-9a-z\.]+) %{NUMBER:Sdk:float} %{NUMBER:Lng:float} %{NUMBER:Lat:float} (?<Exception>.*)" }
    remove_field => "message"
  }
  date {
    # the pattern matches the sample log's timestamp, e.g. 2018-02-24 11:19:23,532
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss,SSS"]
    remove_field => "timestamp"
  }
  geoip {
    source => "clientip"
    target => "geoip"
    database => "/etc/logstash/maxmind/GeoLite2-City.mmdb"
  }
}
output {
  elasticsearch {
    hosts => ["http://192.168.10.101:9200/"]
    index => "logstash-%{+YYYY.MM.dd}"
    document_type => "apache_logs"
  }
}
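Before wiring this up to Beats and Elasticsearch, it can help to verify the grok expression on its own. Below is a minimal test sketch (the shortened pattern and the stdin/stdout plugins are only for illustration, not part of the production configuration above): paste the sample log line into stdin and check that the fields come out as expected.
input { stdin { } }
filter {
  grok {
    # abbreviated version of the pattern above, for quick testing
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \[%{NUMBER:thread:int}\] %{DATA:level} (?<logger>[a-zA-Z]+) %{NUMBER:ExecuteTime:int} %{URI:url} %{IP:clientip} %{GREEDYDATA:rest}" }
  }
}
output { stdout { codec => rubydebug } }
If the pattern does not match, the event is tagged _grokparsefailure, which is usually the first thing to look for when a field is missing.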
3. Analysis effect after cutting
4. Final Kibana display effect
① TOP10 ClientIP
② TOP5 URL
③ Location based on IP
⑤ TOP10 ExecuteTime
⑥ Other fields can also be charted; multiple patterns or multiple graphs can be combined into a single display.
II. Explanation of grok usage
1. Introduction
Grok is currently the best way to turn messy, unstructured logs into something structured and queryable. Grok works well for parsing syslog, Apache and other web server logs, MySQL logs, and generally log files in any format.
Grok ships with more than 120 built-in regular-expression patterns; see https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns.
2. Getting-started example
① Example
55.3.244.1 GET /index.html 15824 0.043
② Analysis
This log can be divided into 5 parts: IP (55.3.244.1), method (GET), requested file path (/index.html), number of bytes (15824), and access duration (0.043). The parsing pattern (regular-expression match) for this log is as follows:
%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}
③ Write it into the filter
filter {
  grok {
    match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
  }
}
④ Post-Parse effect
client: 55.3.244.1, method: GET, request: /index.html, bytes: 15824, duration: 0.043
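One optional refinement, not part of the original example: grok can cast matched numbers with an :int or :float suffix so that bytes and duration are indexed as numbers instead of strings, which makes numeric aggregations in Kibana straightforward. A sketch:
filter {
  grok {
    match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes:int} %{NUMBER:duration:float}" }
  }
}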
3. Parsing logs of any format
(1) Steps to parse a log of any format:
① First determine how the log is segmented, that is, how many parts one log line is cut into.
② Analyze each part: if a built-in grok pattern meets the need, use it directly; if not, use a custom pattern.
③ Learn to debug in the Grok Debugger.
(2) Categories of grok patterns
- Built-in grok_pattern expressions that meet the need
① Where to look them up
# less /usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-4.1.1/patterns/grok-patterns
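For reference, entries in that file look roughly like the lines below (a small, representative excerpt; the exact contents depend on the installed version). Note that patterns can be built from other patterns, e.g. NUMBER reuses BASE10NUM and IP reuses IPV4/IPV6.
USERNAME [a-zA-Z0-9._-]+
INT (?:[+-]?(?:[0-9]+))
NUMBER (?:%{BASE10NUM})
WORD \b\w+\b
IP (?:%{IPV6}|%{IPV4})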
② Usage format
A grok_pattern consists of zero or more %{SYNTAX:SEMANTIC} expressions.
Example: %{IP:clientip}
SYNTAX is the name of the pattern and is provided by grok: for example, the pattern for numbers is named NUMBER and the pattern for IP addresses is named IP.
SEMANTIC is the name given to the matched text and is defined by you; for example, the IP field can be named client.
- Custom regular expressions, used when no built-in pattern meets the need
Usage format: (?<field_name>the pattern here)
Example: (?<Board>[0-9a-zA-Z]+[-]?[0-9a-zA-Z]+)
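Built-in %{SYNTAX:SEMANTIC} patterns and custom (?<field_name>...) captures can be mixed freely in one match string, which is exactly what the configuration in section I does. A minimal sketch against a hypothetical "<clientip> <board>" log layout:
filter {
  grok {
    match => { "message" => "%{IP:clientip} (?<Board>[0-9a-zA-Z]+[-]?[0-9a-zA-Z]+)" }
  }
}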
(3) Writing these regular expressions by hand is error prone, so it is strongly recommended to debug them in the Grok Debugger first (at the time of writing, that page would not open for me).
III. Using the MySQL module to collect MySQL logs
1. Official documentation for module usage
https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-module-mysql.html
2. Configure Filebeat to collect the MySQL slow query log with the mysql module
# vim filebeat.yml
#=========================== Filebeat prospectors =============================
filebeat.modules:
- module: mysql
  error:
    enabled: true
    var.paths: ["/var/log/mariadb/mariadb.log"]
  slowlog:
    enabled: true
    var.paths: ["/var/log/mariadb/mysql-slow.log"]
#----------------------------- Redis output --------------------------------
output.redis:
  hosts: ["192.168.10.102"]
  password: "ilinux.io"
  key: "httpdlogs"
  datatype: "list"
  db: 0
  timeout: 5
3. Logstash on the ELK side cuts the MySQL slow query log
① Cutting Configuration
# vim mysqllogs.conf
input {
  redis {
    host => "192.168.10.102"
    port => "6379"
    password => "ilinux.io"
    data_type => "list"
    key => "httpdlogs"
    threads => 2
  }
}
filter {
  grok {
    match => { "message" => "(?m)^#\s+User@Host:\s+%{USER:user}\[[^\]]+\]\s+@\s+(?:(?<clienthost>\S*) )?\[(?:%{IPV4:clientip})?\]\s+Id:\s+%{NUMBER:row_id:int}\n#\s+Query_time:\s+%{NUMBER:query_time:float}\s+Lock_time:\s+%{NUMBER:lock_time:float}\s+Rows_sent:\s+%{NUMBER:rows_sent:int}\s+Rows_examined:\s+%{NUMBER:rows_examined:int}\n\s*(?:use %{DATA:database};\s*\n)?SET\s+timestamp=%{NUMBER:timestamp};\n\s*(?<sql>(?<action>\w+)\b.*;)\s*(?:\n#\s+Time)?.*$" }
  }
  date {
    # the slow log records an epoch timestamp (SET timestamp=...)
    match => ["timestamp", "UNIX"]
    remove_field => "timestamp"
  }
}
output {
  elasticsearch {
    hosts => ["http://192.168.10.101:9200/"]
    index => "logstash-%{+YYYY.MM.dd}"
    document_type => "mysql_logs"
  }
}
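For context, the multiline pattern above (the (?m) flag lets it span several lines) is written against slow-log entries shaped roughly like the following illustrative, made-up entry; the Filebeat mysql module is expected to have already joined these lines into a single event before they reach Redis:
# User@Host: root[root] @ localhost []  Id:    15
# Query_time: 3.000245  Lock_time: 0.000055  Rows_sent: 1  Rows_examined: 0
SET timestamp=1519442363;
SELECT SLEEP(3);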
② display results after cutting
4. Final Kibana display effect
① Which databases appear most often, for example: TOP2 databases
(The table name cannot be displayed, because some statements do not involve a table and so it cannot be cut out as a field.)
② Which SQL statements appear most often, for example: TOP5 SQL statements
④ Which servers generate the most slow query logs, for example: TOP5 servers
⑤ Which users generate the most slow query logs, for example: TOP2 users
These can also be merged into a single display.
5. Using the MySQL module to collect the MySQL error log
(1) The Filebeat configuration is the same as above.
(2) Logstash on the ELK side cuts the MySQL error log
# vim mysqllogs.conf
filter {
  grok {
    match => { "message" => "(?<timestamp>\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}) %{NUMBER:pid:int} \[%{DATA:level}\] (?<content>.*)" }
  }
  date {
    # the pattern matches the error log's timestamp, e.g. 2018-02-24 11:19:23
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss"]
    remove_field => "timestamp"
  }
}
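For context, this pattern targets MariaDB error-log lines shaped roughly like the following illustrative, made-up line: a date and time, a numeric thread id (captured here as pid), the bracketed level, and the rest of the message as content.
2018-02-24 11:19:23 140190391176960 [Note] InnoDB: Starting shutdown...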
(3) The results are not shown here.
IV. ELK collection of multi-instance logs
In many cases, the company's budget is limited and logs are not collected on a one-to-one basis (one Logstash per agent); instead, one Logstash host runs multiple instances to process the logs of multiple agents.
1. Filebeat configuration
Only the output configuration matters here: simply point the different agents at different ports.
① Agent 1 configuration points to port 5044
#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["192.168.10.107:5044"]
② Agent 2 configuration points to port 5045
#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["192.168.10.107:5045"]
2. Logstash configuration
For the different agents, the input specifies the corresponding port.
① Agent 1
input {
  beats {
    port => 5044
  }
}
output {
  # can be differentiated in the output
  elasticsearch {
    hosts => ["http://192.168.10.107:9200/"]
    index => "logstash-apache1-%{+YYYY.MM.dd}"
    document_type => "apache1_logs"
  }
}
② Agent 2
input {
  beats {
    port => 5045
  }
}
output {
  # can be differentiated in the output
  elasticsearch {
    hosts => ["http://192.168.10.107:9200/"]
    index => "logstash-apache2-%{+YYYY.MM.dd}"
    document_type => "apache2_logs"
  }
}
Start the corresponding services and the setup is complete.
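One caveat if the two configurations are run as two separate Logstash processes on the same host (one way to read "multi-instance" here, and an assumption on my part): each process needs its own configuration file and its own path.data directory, otherwise the second one refuses to start because the data directory is locked. A hedged sketch, with illustrative file names and paths:
# /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/apache1.conf --path.data /var/lib/logstash-apache1
# /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/apache2.conf --path.data /var/lib/logstash-apache2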