ELK classic usage: enterprise custom log collection and cutting, and the MySQL module
This article is part of the Linux O&M Enterprise Architecture Practice series.
1. Collecting and cutting the company's custom logs
Many companies' logs do not follow the default log format of the services they run, so we need to cut (parse) the logs ourselves.
1. Sample log to be cut
11:19:23,532 [143] DEBUG performanceTrace 1145 http://api.114995.com:8082/api/Carpool/QueryMatchRoutes 183.205.134.240 null 972533 310000 860485038452951 TITTL00 HUAWEI 5.1 3.1.146 HUAWEI 113.552344 33.332737 send response completion Exception:(null)
2. Cutting Configuration
On the logstash server, use the grok plug-in to cut the logs:
input {
  beats {
    port => "5044"
  }
}
filter {
  grok {
    match => {
      "message" => "%{TIMESTAMP_ISO8601:timestamp} \[%{NUMBER:thread:int}\] %{DATA:level} (?<logger>[a-zA-Z]+) %{NUMBER:executeTime:int} %{URI:url} %{IP:clientip} %{USERNAME:UserName} %{NUMBER:userid:int} %{NUMBER:AreaCode:int} (?<Board>[0-9a-zA-Z]+[-]?[0-9a-zA-Z]+) (?<Brand>[0-9a-zA-Z]+[-]?[0-9a-zA-Z]+) %{NUMBER:DeviceId:int} (?<TerminalSourceVersion>[0-9a-z\.]+) %{NUMBER:Sdk:float} %{NUMBER:Lng:float} %{NUMBER:Lat:float} (?<Exception>.*)"
    }
    remove_field => "message"
  }
  date {
    match => ["timestamp","dd/MMM/YYYY:H:m:s Z"]
    remove_field => "timestamp"
  }
  geoip {
    source => "clientip"
    target => "geoip"
    database => "/etc/logstash/maxmind/GeoLite2-City.mmdb"
  }
}
output {
  elasticsearch {
    hosts => ["http://192.168.10.101:9200/"]
    index => "logstash-%{+YYYY.MM.dd}"
    document_type => "apache_logs"
  }
}
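Before shipping the parsed events to Elasticsearch, it can help to verify the grok cut locally. A minimal sketch, assuming you temporarily add a second output while testing and remove it afterwards:

output {
  stdout {
    codec => rubydebug    # print each parsed event with all extracted fields
  }
}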
3. Effect after cutting and parsing
4. kibana Display Results
① Top10 clientip
② Top 5 URLs
③ Display geographic locations based on ip addresses
⑤ Top 10 executeTime
⑥ Other fields can be charted as well, and multiple charts can be combined for display
2. Detailed description of grok usage
1. Introduction
Grok is currently the best way to turn messy, unstructured log data into something structured and queryable. It performs well at parsing syslog, Apache and other web server logs, MySQL logs, and files in almost any format.
Grok ships with more than 120 built-in regular expression patterns. Address: https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns.
2. Example
① Example
55.3.244.1 GET /index.html 15824 0.043
② Analysis
This log can be divided into five parts: IP address (55.3.244.1), method (GET), request file path (/index.html), number of bytes (15824), and access duration (0.043). The parsing pattern (regular expression match) for this log is as follows:
%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}
③ Write to filter
filter {
  grok {
    match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
  }
}
④ Parsing result
client: 55.3.244.1
method: GET
request: /index.html
bytes: 15824
duration: 0.043
3. Parsing logs in any format
(1) Steps for parsing logs of any format:
① First determine the splitting principle for the log, that is, how many parts one log line is divided into.
② Analyze each part. If a built-in grok regular expression meets the requirement, use it directly; if none of the built-in patterns fit, use a custom pattern.
③ Learn to debug in the Grok Debugger.
(2) Categories of grok patterns
- Patterns covered by the built-in grok regular expressions (grok_pattern)
① Viewing the built-in patterns
# less /usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-4.1.1/patterns/grok-patterns
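Each line in that file defines one named pattern. The entries look roughly like the following (names taken from the upstream patterns file; exact definitions may differ slightly between versions):

WORD \b\w+\b
NUMBER (?:%{BASE10NUM})
IP (?:%{IPV6}|%{IPV4})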
② Usage format
A grok_pattern consists of zero or more %{SYNTAX:SEMANTIC} expressions.
Example: %{IP:clientip}
SYNTAX is the expression name provided by grok. For example, the numeric expression name is NUMBER, and the IP address expression name is IP.
SEMANTIC is the field name given to the parsed value, and you define it yourself. For example, the IP field can be named clientip.
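By default grok extracts every matched field as a string. A third component can be appended to convert the field type (standard grok syntax; int and float are the supported conversions, and the field names here are just illustrative):

%{NUMBER:bytes:int} %{NUMBER:duration:float}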
- Patterns not covered by the built-in grok regular expressions: custom regular expressions
Format: (?<field_name>the pattern here)
Example: (?<Board>[0-9a-zA-Z]+[-]?[0-9a-zA-Z]+)
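Custom patterns can also be kept in their own file and referenced from the grok filter via patterns_dir. A minimal sketch, assuming a hypothetical pattern file /etc/logstash/patterns/extra containing the line "BOARD [0-9a-zA-Z]+[-]?[0-9a-zA-Z]+":

filter {
  grok {
    patterns_dir => ["/etc/logstash/patterns"]   # directory holding custom pattern files
    match => { "message" => "%{BOARD:Board}" }
  }
}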
(3) Regular expression parsing is error-prone, so it is strongly recommended that you debug your patterns in the Grok Debugger, as shown in the following figure (at the time of writing I cannot open this page).
3. Using the mysql module to collect MySQL logs
1. Official documentation
https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-module-mysql.html
2. Configure filebeat to use the mysql module to collect MySQL logs (error log and slow query log).
# vim filebeat.yml
#=========================== Filebeat prospectors =============================
filebeat.modules:
- module: mysql
  error:
    enabled: true
    var.paths: ["/var/log/mariadb/mariadb.log"]
  slowlog:
    enabled: true
    var.paths: ["/var/log/mariadb/mysql-slow.log"]
#----------------------------- Redis output --------------------------------
output.redis:
  hosts: ["192.168.10.102"]
  password: "ilinux.io"
  key: "httpdlogs"
  datatype: "list"
  db: 0
  timeout: 5
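Note that the slow query log path referenced above only exists if slow query logging is enabled on the database server. A minimal my.cnf sketch (standard MariaDB/MySQL options; the file path is assumed to match the filebeat configuration above):

[mysqld]
slow_query_log = 1                                      # enable the slow query log
slow_query_log_file = /var/log/mariadb/mysql-slow.log   # where the slow log is written
long_query_time = 2                                     # log statements slower than 2 seconds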
3. elk-logstash cuts the MySQL slow query logs
① Cutting Configuration
# vim mysqllogs.conf
input {
  redis {
    host => "192.168.10.102"
    port => "6379"
    password => "ilinux.io"
    data_type => "list"
    key => "httpdlogs"
    threads => 2
  }
}
filter {
  grok {
    match => {
      "message" => "(?m)^#\s+User@Host:\s+%{USER:user}\[[^\]]+\]\s+@\s+(?:(?<clienthost>\S*) )?\[(?:%{IPV4:clientip})?\]\s+Id:\s+%{NUMBER:row_id:int}\n#\s+Query_time:\s+%{NUMBER:query_time:float}\s+Lock_time:\s+%{NUMBER:lock_time:float}\s+Rows_sent:\s+%{NUMBER:rows_sent:int}\s+Rows_examined:\s+%{NUMBER:rows_examined:int}\n\s*(?:use %{DATA:database};\s*\n)?SET\s+timestamp=%{NUMBER:timestamp};\n\s*(?<sql>(?<action>\w+)\b.*;)\s*(?:\n#\s+Time)?.*$"
    }
  }
  date {
    # "SET timestamp=...;" in the slow log is epoch seconds, so parse it as UNIX time
    match => ["timestamp","UNIX"]
    remove_field => "timestamp"
  }
}
output {
  elasticsearch {
    hosts => ["http://192.168.10.101:9200/"]
    index => "logstash-%{+YYYY.MM.dd}"
    document_type => "mysql_logs"
  }
}
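Optionally, events that the pattern does not match (for example lines in the slow log that are not query records) can be discarded so they do not pollute the index. A minimal sketch using the standard _grokparsefailure tag:

filter {
  if "_grokparsefailure" in [tags] {
    drop { }    # discard events the grok pattern could not parse
  }
}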
② Display results after cutting
4. kibana final display
① Which databases are accessed most, for example, the top 2 databases
Tables cannot be displayed, because some statements do not involve a table and the table name therefore cannot be cut out.
② Which SQL statements appear most frequently, for example, top 5 SQL statements?
③ Which servers generate the most slow query logs, for example, the top 5 servers
④ Which users generate the most slow query logs, for example, the top 2 users
The above can also be combined into one dashboard for display.
5. Use the mysql module to collect MySQL error logs
(1) filebeat configuration is the same as above
(2) elk-logstash cut mysql error logs
# vim mysqllogs.conf
filter {
  grok {
    match => {
      "message" => "(?<timestamp>\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}) %{NUMBER:pid:int} \[%{DATA:level}\] (?<content>.*)"
    }
  }
  date {
    # the timestamp captured above has the form "2018-03-13 15:01:02"
    match => ["timestamp","yyyy-MM-dd HH:mm:ss"]
    remove_field => "timestamp"
  }
}
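For reference, this pattern expects MariaDB error log lines of roughly the following shape (an illustrative example, not taken from a real server):

2018-03-13 15:01:02 140003721571072 [Note] InnoDB: Starting shutdown...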
(3) The resulting display is not shown here.
4. ELK collects multi-instance logs
In many cases a company does not have the budget to dedicate a collector to each log source, so a single logstash host has to run multiple instances to collect and process logs from multiple agents.
1. filebeat Configuration
This mainly concerns the output configuration: you only need to point different agents at different ports.
① Agent 1 configuration points to port 5044
#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["192.168.10.107:5044"]
② Configure agent 2 to point to port 5045
#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["192.168.10.107:5045"]
2. logstash Configuration
For different agents, input specifies the corresponding port
① Agent 1
input {
  beats {
    port => "5044"
  }
}
output {
  elasticsearch {
    hosts => ["http://192.168.10.107:9200/"]
    index => "logstash-apache1-%{+YYYY.MM.dd}"
    document_type => "apache1_logs"
  }
}
② Agent 2
input {
  beats {
    port => "5045"
  }
}
output {
  elasticsearch {
    hosts => ["http://192.168.10.107:9200/"]
    index => "logstash-apache2-%{+YYYY.MM.dd}"
    document_type => "apache2_logs"
  }
}
You can then start the corresponding services.
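When running several logstash instances on the same host, each instance needs its own data directory, otherwise the second one will refuse to start. A minimal sketch (config file names and data paths are assumptions for illustration):

# /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/agent1.conf --path.data /var/lib/logstash/agent1
# /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/agent2.conf --path.data /var/lib/logstash/agent2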