Distributed Real-Time Log Processing Platform: ELK

Source: Internet
Author: User
Tags: kibana, logstash

ELK provides three functions: log collection, indexing and search, and visualized display.

 

Logstash

In this architecture, Logstash appears at both the collection and indexing stages. A .conf file is passed in at runtime, and its configuration is divided into three parts: input, filter, and output.
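A minimal skeleton of such a file, shown here purely for illustration with placeholder stdin/stdout plugins, looks like this:

input {
  stdin { }                        # where events come from
}
filter {
  # optional transformations applied to every event
}
output {
  stdout { codec => rubydebug }    # where events go
}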

Redis

Redis serves as a buffer queue that decouples log collection from indexing.
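As a quick sanity check of that buffer, you can inspect the backlog of the Redis list that Logstash writes to (the host and key name below are taken from the configuration examples later in this article):

redis-cli -h redis.internal.173 -p 6379 llen soalog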

Elasticsearch

The core component used for indexing and search. Main features: real-time, distributed, highly available, document-oriented, schema-free, RESTful.
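For example, its RESTful, schema-free nature means a document can be indexed and searched over plain HTTP without defining a mapping first (the test index and fields below are made up for illustration):

curl -XPUT 'http://localhost:9200/test/logs/1' -d '{"msg": "hello elk", "status": 200}'
curl -XGET 'http://localhost:9200/test/logs/_search?q=status:200&pretty'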

 

Kibana

A log visualization component that makes interacting with the data easier.

 

Deploy required components
  • Logstash https://download.elasticsearch.org/logstash/logstash/logstash-1.4.2.tar.gz
  • Redis http://download.redis.io/releases/redis-stable.tar.gz
  • Elasticsearch https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.3.2.zip
  • Kibana https://github.com/elasticsearch/kibana


Logstash

Logstash 10-minute Tutorial: http://logstash.net/docs/1.4.2/tutorials/10-minute-walkthrough/

 

Download and decompress the latest logstash version
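For example, using the download link from the component list above:

wget https://download.elasticsearch.org/logstash/logstash/logstash-1.4.2.tar.gz
tar -zxvf logstash-1.4.2.tar.gz
cd logstash-1.4.2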


Edit the logstash.conf configuration file

 

Logstash User Manual: http://logstash.net/docs/1.4.2/

Log4j server configuration instance: log4j.conf

input {
  log4j {
    data_timeout => 5
    # mode => "server"
    # port => 4560
  }
}

filter {
  json {
    source => "message"
    remove_field => ["message", "class", "file", "host", "method", "path", "priority", "thread", "type", "logger_name"]
  }
}

output {
  # stdout { codec => json }
  redis {
    host => "redis.internal.173"
    port => 6379
    data_type => "list"
    key => "soalog"
  }
}


Logstash output to Elasticsearch configuration instance: soalog-es.conf

input {
  redis {
    host => "redis.internal.173"
    port => "6379"
    key => "soalog"
    data_type => "list"
  }
}

filter {
  json {
    source => "message"
    remove_field => ["message", "type"]
  }
}

output {
  elasticsearch {
    # host => "es1.internal.173,es2.internal.173,es3.internal.173"
    cluster => "soaes"
    index => "soa_logs-%{+YYYY.MM.dd}"
  }
}

 

 

Here, the filter's source => "message" setting parses the JSON string in the message field into individual index fields, and remove_field then deletes the fields that are no longer needed.
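As a hypothetical illustration (the field values are invented for this example), an event passes through the filter roughly like this:

# before the json filter: the whole log record is one JSON string in "message"
{ "message": "{\"appid\":\"order\",\"status\":200,\"processtime\":35}", "host": "app01", ... }

# after source => "message" and remove_field: the JSON keys become top-level index fields
{ "appid": "order", "status": 200, "processtime": 35, "@timestamp": "..." }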

 

Start

./logstash -f soalog-es.conf --verbose -l ../soalog-es.log &
./logstash -f log4j.conf --verbose -l ../log4j.log &


Elasticsearch

 

Download the latest elasticsearch version and decompress it.
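For example, using the download link from the component list above:

wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.3.2.zip
unzip elasticsearch-1.3.2.zip
cd elasticsearch-1.3.2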

 

Run in the background: bin/elasticsearch -d

 

Verify
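A simple check, assuming the default HTTP port 9200, is to request the root endpoint; it should return a small JSON document with the node and cluster information:

curl http://localhost:9200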

 

Elasticsearch cluster configuration:

Edit config/elasticsearch.yml

# Specify the cluster name; "elasticsearch" is used by default
cluster.name: soaes

# Specify the data storage directory; multiple disks can be listed: /path/to/data1,/path/to/data2
path.data: /mnt/hadoop/esdata

# Specify the log storage directory
path.logs: /mnt/hadoop/eslogs

# Cluster master node list, used to discover new nodes
discovery.zen.ping.unicast.hosts: ["hadoop74", "hadoop75"]

 

Configure an ES index template to specify whether each field is analyzed and how it is stored.

Create a templates directory under the config directory.

Add the template file template-soalogs.json:

{
  "template-soalogs": {
    "template": "soa_logs*",
    "settings": {
      "index.number_of_shards": 5,
      "number_of_replicas": 1,
      "index": {
        "store": {
          "compress": {
            "stored": true,
            "tv": true
          }
        }
      }
    },
    "mappings": {
      "logs": {
        "properties": {
          "providernode": { "index": "not_analyzed", "type": "string" },
          "servicemethod": { "index": "not_analyzed", "type": "string" },
          "appid": { "index": "not_analyzed", "type": "string" },
          "status": { "type": "long" },
          "srcappid": { "index": "not_analyzed", "type": "string" },
          "remark": { "type": "string" },
          "serviceversion": { "index": "not_analyzed", "type": "string" },
          "srcserviceversion": { "index": "not_analyzed", "type": "string" },
          "logside": { "type": "long" },
          "invoketime": { "type": "long" },
          "@version": { "type": "string" },
          "@timestamp": { "format": "dateOptionalTime", "type": "date" },
          "srcserviceinterface": { "index": "not_analyzed", "type": "string" },
          "serviceinterface": { "index": "not_analyzed", "type": "string" },
          "retrycount": { "type": "long" },
          "traceid": { "index": "not_analyzed", "type": "string" },
          "processtime": { "type": "long" },
          "consumernode": { "index": "not_analyzed", "type": "string" },
          "rpcid": { "index": "not_analyzed", "type": "string" },
          "srcservicemethod": { "index": "not_analyzed", "type": "string" }
        }
      }
    }
  }
}
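To confirm that the template has taken effect, you can inspect the mapping of a daily index once data starts flowing (a sketch, assuming ES is reachable on localhost:9200):

# list the field mappings of the soa_logs daily indexes
curl 'http://localhost:9200/soa_logs-*/_mapping?pretty'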

 

 

Kibana

Go to the elasticsearch directory

bin/plugin -install elasticsearch/kibana
Verification: http://localhost:9200/_plugin/kibana

Kibana needs an index pattern configured for queries.

 

 

Here the index is soa_logs; it must be configured as a time-stamped index with the day format (YYYY-MM-DD) before Kibana can find the daily indexes.


Logstash time difference: 8 hours

 

When Logstash outputs to Elasticsearch, index names are based on UTC, so the new daily index is only created eight hours into the local (UTC+8) day, and events logged before that point are written into the previous day's index. For example, an event logged at 07:00 local time on September 2 is 23:00 September 1 in UTC, so it lands in the September 1 index.

You can modify logstash/lib/logstash/event.rb to work around this.

Around line 226, change

.withZone(org.joda.time.DateTimeZone::UTC)

to

.withZone(org.joda.time.DateTimeZone.getDefault())

 

 

log4j.properties configuration

# Remote logging
log4j.additivity.logstash=false
log4j.logger.logstash=INFO,logstash
log4j.appender.logstash=org.apache.log4j.net.SocketAppender
log4j.appender.logstash.RemoteHost=localhost
log4j.appender.logstash.Port=4560
log4j.appender.logstash.LocationInfo=false

 

 

Java log output

private static final org.slf4j.Logger logstash = org.slf4j.LoggerFactory.getLogger("logstash");

logstash.info(JSONObject.toJSONString(rpcLog));
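A slightly fuller sketch of how this is usually wired up; the RpcLog POJO and the use of fastjson's JSONObject are assumptions for illustration, not part of the original setup:

import com.alibaba.fastjson.JSONObject;   // assumed JSON library
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class RpcLogger {
    // the logger name "logstash" matches the log4j.properties configuration above
    private static final Logger LOGSTASH = LoggerFactory.getLogger("logstash");

    // RpcLog is a hypothetical POJO whose fields correspond to the ES template mappings
    public static void log(RpcLog rpcLog) {
        // one JSON document per call, shipped via the SocketAppender to Logstash
        LOGSTASH.info(JSONObject.toJSONString(rpcLog));
    }
}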

 

 

Kopf

Elasticsearch Cluster Monitoring

bin/plugin -install lmenezes/elasticsearch-kopf

http://localhost:9200/_plugin/kopf



Example of Logstash processing Tomcat logs:

Configure tomcat.conf on the Logstash agent

input {
  file {
    type => "usap"
    path => ["/opt/17173/apache-tomcat-7.0.50-8090/logs/catalina.out",
             "/opt/17173/apache-tomcat-7.0.50-8088/logs/catalina.out",
             "/opt/17173/apache-tomcat-7.0.50-8086/logs/catalina.out",
             "/opt/17173/apache-tomcat-7.0.50-8085/logs/catalina.out",
             "/opt/17173/apache-tomcat-6.0.37-usap-image/logs/catalina.out"]
    codec => multiline {
      pattern => "(^.+Exception:.+)|(^\s+at .+)|(^\s+\.\.\. \d+ more)|(^\s*Caused by:.+)"
      what => "previous"
    }
  }
}

filter {
  grok {
    # match => { "message" => "%{COMBINEDAPACHELOG}" }
    match => ["message", "%{TOMCATLOG}", "message", "%{CATALINALOG}"]
    remove_field => ["message"]
  }
}

output {
  # stdout { codec => rubydebug }
  redis { host => "redis.internal.173" data_type => "list" key => "usap" }
}

 

 

Modify logstash/patterns/grok-patterns

Add the following grok patterns for the Tomcat log format:

# tomcat log
JAVACLASS (?:[a-zA-Z0-9-]+\.)+[A-Za-z0-9$]+
JAVALOGMESSAGE (.*)
THREAD [A-Za-z0-9\-\[\]]+
# MMM dd, yyyy HH:mm:ss eg: Jan 9, 2014 7:13:13 AM
CATALINA_DATESTAMP %{MONTH} %{MONTHDAY}, 20%{YEAR} %{HOUR}:?%{MINUTE}(?::?%{SECOND}) (?:AM|PM)
# yyyy-MM-dd HH:mm:ss,SSS ZZZ eg: 17:32:25,527 -0800
TOMCAT_DATESTAMP 20%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{HOUR}:?%{MINUTE}(?::?%{SECOND}) %{ISO8601_TIMEZONE}
LOG_TIME %{HOUR}:?%{MINUTE}(?::?%{SECOND})
CATALINALOG %{CATALINA_DATESTAMP:timestamp} %{JAVACLASS:class} %{JAVALOGMESSAGE:logmessage}
# 11:27:51,786 [http-bio-8088-exec-4] DEBUG JsonRpcServer:504 - invoking method: getHistory
# TOMCATLOG %{LOG_TIME:timestamp} %{THREAD:thread} %{LOGLEVEL:level} %{JAVACLASS:class} - %{JAVALOGMESSAGE:logmessage}
TOMCATLOG %{TOMCAT_DATESTAMP:timestamp} %{LOGLEVEL:level} %{JAVACLASS:class} - %{JAVALOGMESSAGE:logmessage}

 

Start Tomcat log agent:

./logstash -f tomcat.conf --verbose -l ../tomcat.log &

 

Tomcat logs are stored in ES

Configure tomcat-es.conf

input {
  redis {
    host => 'redis.internal.173'
    data_type => 'list'
    port => "6379"
    key => 'usap'
    # type => 'redis-input'
    # codec => json
  }
}

output {
  # stdout { codec => rubydebug }
  elasticsearch {
    # host => "es1.internal.173,es2.internal.173,es3.internal.173"
    cluster => "soaes"
    index => "usap-%{+YYYY.MM.dd}"
  }
}

 

Start Tomcat log storage

./logstash -f tomcat-es.conf --verbose -l ../tomcat-es.log &

 

 

Example of Logstash collecting nginx and syslog logs

Configure nginx.conf on the Logstash agent

input {
  file {
    type => "linux-syslog"
    path => ["/var/log/*.log", "/var/log/messages"]
  }
  file {
    type => "nginx-access"
    path => "/usr/local/nginx/logs/access.log"
  }
  file {
    type => "nginx-error"
    path => "/usr/local/nginx/logs/error.log"
  }
}

output {
  # stdout { codec => rubydebug }
  redis { host => "redis.internal.173" data_type => "list" key => "nginx" }
}

 

Start nginx log agent

./logstash -f nginx.conf --verbose -l ../nginx.log &

 

Store nginx logs in ES

Configure nginx-es.conf

input {
  redis {
    host => 'redis.internal.173'
    data_type => 'list'
    port => "6379"
    key => 'nginx'
    # type => 'redis-input'
    # codec => json
  }
}

filter {
  grok {
    type => "linux-syslog"
    pattern => "%{SYSLOGLINE}"
  }
  grok {
    type => "nginx-access"
    pattern => "%{IPORHOST:source_ip} - %{USERNAME:remote_user} \[%{HTTPDATE:timestamp}\] %{IPORHOST:host} %{QS:request} %{INT:status} %{INT:body_bytes_sent} %{QS:http_referer} %{QS:http_user_agent}"
  }
}

output {
  # stdout { codec => rubydebug }
  elasticsearch {
    # host => "es1.internal.173,es2.internal.173,es3.internal.173"
    cluster => "soaes"
    index => "nginx-%{+YYYY.MM.dd}"
  }
}

 

Start nginx log storage

./logstash -f nginx-es.conf --verbose -l ../nginx-es.log &

 

 
