ELK Classic Usage: Enterprise Custom Log Collection and Cutting, and the MySQL Module

Source: Internet
Author: User
Tags: geoip, kibana, logstash, filebeat

This article is part of the Linux O&M Enterprise Architecture Practice series.

1. Collecting and cutting enterprise custom logs

Many companies' logs do not match the default log format of any standard service, so we need to cut (parse) the logs ourselves.

1. Sample log to be cut

11:19:23,532 [143] DEBUG performanceTrace 1145 http://api.114995.com:8082/api/Carpool/QueryMatchRoutes 183.205.134.240 null 972533 310000 860485038452951 TITTL00 HUAWEI 5.1 3.1.146 HUAWEI 113.552344 33.332737 send response completion Exception:(null)

2. Cutting Configuration

On logstash, use the grok plugin in the filter block to do the cutting:

input {
    beats {
        port => "5044"
    }
}
filter {
    grok {
        match => {
            "message" => "%{TIMESTAMP_ISO8601:timestamp} \[%{NUMBER:thread:int}\] %{DATA:level} (?<logger>[a-zA-Z]+) %{NUMBER:executeTime:int} %{URI:url} %{IP:clientip} %{USERNAME:UserName} %{NUMBER:userid:int} %{NUMBER:AreaCode:int} (?<Board>[0-9a-zA-Z]+[-]?[0-9a-zA-Z]+) (?<Brand>[0-9a-zA-Z]+[-]?[0-9a-zA-Z]+) %{NUMBER:DeviceId:int} (?<TerminalSourceVersion>[0-9a-z\.]+) %{NUMBER:Sdk:float} %{NUMBER:Lng:float} %{NUMBER:Lat:float} (?<Exception>.*)"
        }
        remove_field => "message"
    }
    date {
        match => ["timestamp","dd/MMM/YYYY:H:m:s Z"]
        remove_field => "timestamp"
    }
    geoip {
        source => "clientip"
        target => "geoip"
        database => "/etc/logstash/maxmind/GeoLite2-City.mmdb"
    }
}
output {
    elasticsearch {
        hosts => ["http://192.168.10.101:9200/"]
        index => "logstash-%{+YYYY.MM.dd}"
        document_type => "apache_logs"
    }
}
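
Before (re)starting the service, it helps to confirm that the configuration at least parses; logstash has a built-in syntax check (the config path below is an assumption):

# sanity-check the pipeline configuration and exit
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/beats.conf --config.test_and_exit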

 

3. Effect after cutting and parsing

4. kibana Display Results

① Top 10 clientip

② Top 5 URLs

③ Geographic locations displayed based on IP addresses

④ Top 10 executeTime

⑤ Other fields can be displayed the same way, and multiple panels can be put together on one dashboard

2. Detailed description of grok usage

1. Introduction

Grok is by far the best way to turn poor, unstructured logs into something structured and queryable. Grok works well with syslogs, Apache and other web server logs, MySQL logs, and generally with log files in any format.

Grok ships with more than 120 built-in regular expression patterns. Address: https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns.

2. Example

① Example

55.3.244.1 GET /index.html 15824 0.043

② Analysis

This log can be divided into five parts: IP address (55.3.244.1), method (GET), request file path (/index.html), number of bytes (15824), and access duration (0.043). The parsing pattern for this log (a regular-expression match) is as follows:

%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}

③ Write to filter

filter {
    grok {
        match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
    }
}

④ Parsing effect

client: 55.3.244.1
method: GET
request: /index.html
bytes: 15824
duration: 0.043
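
You can reproduce this parse locally with a throwaway pipeline that reads stdin and prints the parsed event; a minimal sketch (the file name test.conf is just an assumption):

input { stdin { } }
filter {
    grok {
        match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
    }
}
output { stdout { codec => rubydebug } }

Run logstash -f test.conf, paste the sample line above, and the same fields appear in the printed event.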

 

3. Parsing logs in any format

(1) Steps for parsing logs of any format:

① First determine the splitting principle, that is, how many parts one log line is divided into.

② Analyze each part. If a built-in grok pattern fits, use it directly; if none does, write a custom pattern.

③ Learn to debug in the Grok Debugger.

(2) grok Classification

  • Logs that the built-in grok patterns (grok_pattern) can match

① Where to view the built-in patterns

# less /usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-4.1.1/patterns/grok-patterns
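
A few entries from that file show the shape of a pattern definition, one "NAME regex" pair per line (an excerpt; exact contents vary by version):

USERNAME [a-zA-Z0-9._-]+
INT (?:[+-]?(?:[0-9]+))
NUMBER (?:%{BASE10NUM})
IP (?:%{IPV6}|%{IPV4})
WORD \b\w+\b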

② Usage format

A grok_pattern consists of zero or more %{SYNTAX:SEMANTIC} segments.

Example: %{IP:clientip}

SYNTAX is the name of a pattern provided by grok; for example, the pattern for numbers is named NUMBER and the pattern for IP addresses is named IP.

SEMANTIC is the name you give to the matched field, defined by yourself; for example, the IP field can be named client.
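
An optional third part casts the captured string to a number, which is what the :int and :float suffixes in the cutting configuration in section 1 do. For example:

%{NUMBER:bytes:int} %{NUMBER:duration:float}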

  • Custom SYNTAX

Format: (?<field_name>the pattern here)

Example: (?<Board>[0-9a-zA-Z]+[-]?[0-9a-zA-Z]+)
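
A custom pattern can also be named and reused like a built-in one: put the definition in a pattern file and point grok at the directory with patterns_dir (a sketch; the paths here are assumptions):

# /etc/logstash/patterns/extra -- one "NAME regex" definition per line
BOARD [0-9a-zA-Z]+[-]?[0-9a-zA-Z]+

filter {
    grok {
        patterns_dir => ["/etc/logstash/patterns"]
        match => { "message" => "%{BOARD:Board}" }
    }
}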

(3) Regular-expression parsing is error-prone, so it is strongly recommended that you debug your patterns in the Grok Debugger (the page would not open for me at the time of writing).

3. Use the mysql module to collect mysql logs

1. Official documentation

https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-module-mysql.html

2. Configure filebeat, using the mysql module to collect the mysql logs (error log and slow log).

# vim filebeat.yml

#=========================== Filebeat prospectors =============================
filebeat.modules:
- module: mysql
  error:
    enabled: true
    var.paths: ["/var/log/mariadb/mariadb.log"]
  slowlog:
    enabled: true
    var.paths: ["/var/log/mariadb/mysql-slow.log"]

#----------------------------- Redis output --------------------------------
output.redis:
  hosts: ["192.168.10.102"]
  password: "ilinux.io"
  key: "httpdlogs"
  datatype: "list"
  db: 0
  timeout: 5
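
In recent filebeat versions the module can also be switched on from the command line instead of being written inline (a hedged alternative; whether the subcommand exists depends on the installed version):

filebeat modules enable mysql
filebeat modules list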

 

3. logstash cuts the mysql slow query logs

① Cutting Configuration

# vim mysqllogs.conf

input {
    redis {
        host => "192.168.10.102"
        port => "6379"
        password => "ilinux.io"
        data_type => "list"
        key => "httpdlogs"
        threads => 2
    }
}
filter {
    grok {
        match => { "message" => "(?m)^#\s+User@Host:\s+%{USER:user}\[[^\]]+\]\s+@\s+(?:(?<clienthost>\S*) )?\[(?:%{IPV4:clientip})?\]\s+Id:\s+%{NUMBER:row_id:int}\n#\s+Query_time:\s+%{NUMBER:query_time:float}\s+Lock_time:\s+%{NUMBER:lock_time:float}\s+Rows_sent:\s+%{NUMBER:rows_sent:int}\s+Rows_examined:\s+%{NUMBER:rows_examined:int}\n\s*(?:use %{DATA:database};\s*\n)?SET\s+timestamp=%{NUMBER:timestamp};\n\s*(?<sql>(?<action>\w+)\b.*;)\s*(?:\n#\s+Time)?.*$" }
    }
    date {
        match => ["timestamp","dd/MMM/YYYY:H:m:s Z"]
        remove_field => "timestamp"
    }
}
output {
    elasticsearch {
        hosts => ["http://192.168.10.101:9200/"]
        index => "logstash-%{+YYYY.MM.dd}"
        document_type => "mysql_logs"
    }
}
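
For reference, the multiline (?m) pattern above expects slow-log entries of roughly this shape (the values are illustrative, not from a real server):

# User@Host: root[root] @ localhost []  Id: 42
# Query_time: 3.020000  Lock_time: 0.000000  Rows_sent: 1  Rows_examined: 0
use testdb;
SET timestamp=1517714363;
SELECT SLEEP(3);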

② Display results after cutting

4. kibana final display

① Which databases are used most, for example the top 2?

(The table name cannot be displayed, because some statements do not involve a table and so it cannot be cut out.)

② Which SQL statements appear most frequently, for example the top 5 statements?

③ Which servers generate the most slow-query logs, for example the top 5 servers?

④ Which users generate the most slow-query logs, for example the top 2 users?

These panels can also be combined on a single dashboard.

5. Use the mysql module to collect mysql error logs

(1) The filebeat configuration is the same as above.

(2) logstash cuts the mysql error logs

# vim mysqllogs.conf

filter {
    grok {
        match => { "message" => "(?<timestamp>\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}) %{NUMBER:pid:int} \[%{DATA:level}\] (?<content>.*)" }
    }
    date {
        match => ["timestamp","dd/MMM/YYYY:H:m:s Z"]
        remove_field => "timestamp"
    }
}
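
A log line this pattern matches looks roughly like the following (an illustrative MariaDB-style entry, not real output):

2018-02-04 11:19:23 140003 [ERROR] Aborted connection 42 to db: 'testdb' user: 'root'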

 

(3) The results are not shown here.

4. ELK collects multi-instance logs

In many cases, a company does not have the budget to pair one collector with every log source, so a single logstash host must run multiple instances to collect and process the logs from multiple agents.

1. filebeat Configuration

This mainly concerns the output configuration: you only need to point different agents at different ports.

① Agent 1 configuration points to port 5044

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["192.168.10.107:5044"]

② Configure agent 2 to point to port 5045

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["192.168.10.107:5045"]

 

2. logstash Configuration

For different agents, input specifies the corresponding port

① Agent 1

input {
    beats {
        port => "5044"
    }
}
output {
    elasticsearch {
        hosts => ["http://192.168.10.107:9200/"]
        index => "logstash-apache1-%{+YYYY.MM.dd}"
        document_type => "apache1_logs"
    }
}

② Agent 2

input {
    beats {
        port => "5045"
    }
}
output {
    elasticsearch {
        hosts => ["http://192.168.10.107:9200/"]
        index => "logstash-apache2-%{+YYYY.MM.dd}"
        document_type => "apache2_logs"
    }
}

You can enable the corresponding service.
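
One practical note: two logstash instances started from the same installation each need their own configuration file and their own data directory, or the second one will refuse to start; a sketch (file and directory names are assumptions):

/usr/share/logstash/bin/logstash -f /etc/logstash/agent1.conf --path.data /var/lib/logstash-5044 &
/usr/share/logstash/bin/logstash -f /etc/logstash/agent2.conf --path.data /var/lib/logstash-5045 &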
