The platform provides three functions: log collection, indexing and search, and visualized display.
Logstash
In this architecture, Logstash is where both collection and indexing happen. At runtime it reads a .conf file whose configuration is divided into three sections: input, filter, and output (a minimal skeleton is sketched just after this component overview).
Redis
Redis acts as a buffer that decouples log collection from indexing.
Elasticsearch
The core component for indexing and search. Main features: real-time, distributed, highly available, document-oriented, schema-free, RESTful.
Kibana
The visualization component, which makes interacting with log data easier.
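As a minimal sketch of the three-section .conf layout mentioned above (stdin and stdout are placeholder plugins, not the pipelines used later in this article):

input {
  stdin { }                       # where events enter the pipeline
}
filter {
  # per-event transformations go here (empty in this skeleton)
}
output {
  stdout { codec => rubydebug }   # where events are written
}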
Deploy required components
- Logstash https://download.elasticsearch.org/logstash/logstash/logstash-1.4.2.tar.gz
- Redis http://download.redis.io/releases/redis-stable.tar.gz
- Elasticsearch https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.3.2.zip
- Kibana https://github.com/elasticsearch/kibana
Logstash
Logstash 10-minute Tutorial: http://logstash.net/docs/1.4.2/tutorials/10-minute-walkthrough/
Download and decompress the latest logstash version
Edit the logstash.conf configuration file
Logstash User Manual: http://logstash.net/docs/1.4.2/
Log4j server configuration example: log4j.conf
input {
  log4j {
    data_timeout => 5
    # mode => "server"
    port => 4560
  }
}
filter {
  json {
    source => "message"
    remove_field => ["message", "class", "file", "host", "method", "path", "priority", "thread", "type", "logger_name"]
  }
}
output {
  # stdout { codec => json }
  redis {
    host => "redis.internal.173"
    port => 6379
    data_type => "list"
    key => "soalog"
  }
}
Logstash output to Elasticsearch configuration example: soalog-es.conf
input {
  redis {
    host => "redis.internal.173"
    port => "6379"
    key => "soalog"
    data_type => "list"
  }
}
filter {
  json {
    source => "message"
    remove_field => ["message", "type"]
  }
}
output {
  elasticsearch {
    # host => "es1.internal.173,es2.internal.173,es3.internal.173"
    cluster => "soaes"
    index => "soa_logs-%{+YYYY.MM.dd}"
  }
}
Here the filter's source => "message" setting parses the JSON string held in the message field into individual index fields, and remove_field then drops the fields that are no longer needed.
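For example, suppose an incoming event's message field holds the JSON string below (the field names follow the index template shown later; the values are invented):

{"traceid":"0a1b2c3d","serviceinterface":"com.example.UserService","status":200,"processtime":35}

The json filter promotes traceid, serviceinterface, status, and processtime to top-level index fields, and remove_field then drops the raw message together with the listed log4j metadata fields.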
Start
./logstash -f soalog-es.conf --verbose -l ../soalog-es.log &
./logstash -f log4j.conf --verbose -l ../log4j.log &
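Once both agents are running, you can confirm that events are flowing through the Redis buffer. A quick check, assuming the host and list key from the configurations above:

redis-cli -h redis.internal.173 -p 6379 llen soalog

The list length should grow as log4j events arrive and shrink as the soalog-es pipeline drains them into Elasticsearch.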
Elasticsearch
Download the latest elasticsearch version and decompress it.
Run in the background: bin/elasticsearch -d
Verify
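For example, a quick check that the node is up (9200 is the default HTTP port; adjust the host if needed):

curl http://localhost:9200/

This returns a small JSON document with the node name, cluster name, and version.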
Elasticsearch cluster configuration:
Edit config/elasticsearch.yml
# Specify the cluster name; the default is elasticsearch
cluster.name: soaes
# Specify the data storage directory; multiple disks can be given: /path/to/data1,/path/to/data2
path.data: /mnt/hadoop/esdata
# Specify the log storage directory
path.logs: /mnt/hadoop/eslogs
# List of master-eligible nodes in the cluster, used to discover new nodes
discovery.zen.ping.unicast.hosts: ["hadoop74", "hadoop75"]
Configure an ES index template to specify whether each field is analyzed and what type it is stored as.
Create the templates directory under the config directory
Add template file template-soalogs.json
{ "Template-soalogs ":{ "Template": "soa_logs *", "Settings ":{ "Index. number_of_shards": 5, "Number_of_replicas": 1, "Index ":{ "Store ":{ "Compress ":{ "Stored": True, "TV": True } } } }, "Mappings ":{ "Logs ":{ "Properties ":{ "Providernode ":{ "Index": "not_analyzed ", "Type": "string" }, "Servicemethod ":{ "Index": "not_analyzed ", "Type": "string" }, "Appid ":{ "Index": "not_analyzed ", "Type": "string" }, "Status ":{ "Type": "long" }, "Srcappid ":{ "Index": "not_analyzed ", "Type": "string" }, "Remark ":{ "Type": "string" }, "Serviceversion ":{ "Index": "not_analyzed ", "Type": "string" }, "Srcserviceversion ":{ "Index": "not_analyzed ", "Type": "string" }, "Logside ":{ "Type": "long" }, "Invoketime ":{ "Type": "long" }, "@ Version ":{ "Type": "string" }, "@ Timestamp ":{ "Format": "dateoptionaltime ", "Type": "date" }, "Srcserviceinterface ":{ "Index": "not_analyzed ", "Type": "string" }, "Serviceinterface ":{ "Index": "not_analyzed ", "Type": "string" }, "Retrycount ":{ "Type": "long" }, "Traceid ":{ "Index": "not_analyzed ", "Type": "string" }, "Processtime ":{ "Type": "long" }, "Consumernode ":{ "Index": "not_analyzed ", "Type": "string" }, "Rpcid ":{ "Index": "not_analyzed ", "Type": "string" }, "Srcservicemethod ":{ "Index": "not_analyzed ", "Type": "string" } } } } } } |
Kibana
Go to the elasticsearch directory
bin/plugin -install elasticsearch/kibana
Verify: http://localhost:9200/_plugin/kibana
Kibana needs an index pattern configured for queries.
Here the index is soa_logs; the timestamped index format must be set to the YYYY.MM.DD date pattern (matching the soa_logs-%{+YYYY.MM.dd} naming above) before Kibana can find the daily indices.
Logstash 8-hour time offset
When Logstash writes daily indices to Elasticsearch, timestamps use UTC, so in a UTC+8 time zone the current day's index is only created 8 hours into the day, and data logged before then goes into the previous day's index.
You can work around this by modifying logstash/lib/logstash/event.rb.
Around line 226, change
.withZone(org.joda.time.DateTimeZone::UTC)
to
.withZone(org.joda.time.DateTimeZone.getDefault())
log4j.properties configuration
# Remote logging
log4j.additivity.logstash=false
log4j.logger.logstash=INFO,logstash
log4j.appender.logstash=org.apache.log4j.net.SocketAppender
log4j.appender.logstash.RemoteHost=localhost
log4j.appender.logstash.Port=4560
log4j.appender.logstash.LocationInfo=false
Java log output
private static final org.slf4j.Logger logstash = org.slf4j.LoggerFactory.getLogger("logstash");
logstash.info(JSONObject.toJSONString(rpcLog));
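A fuller sketch of the Java side, assuming fastjson for serialization and the "logstash" logger configured in log4j.properties above; the field names echo template-soalogs.json and the values are invented:

import java.util.HashMap;
import java.util.Map;

import com.alibaba.fastjson.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LogstashLogDemo {
    // the logger name "logstash" routes events to the SocketAppender on port 4560
    private static final Logger LOGSTASH = LoggerFactory.getLogger("logstash");

    public static void main(String[] args) {
        // flat map of fields; the logstash json filter expands the string into index fields
        Map<String, Object> rpcLog = new HashMap<String, Object>();
        rpcLog.put("traceid", "0a1b2c3d");                          // invented value
        rpcLog.put("serviceinterface", "com.example.UserService");  // invented value
        rpcLog.put("status", 200L);
        rpcLog.put("processtime", 35L);
        LOGSTASH.info(JSONObject.toJSONString(rpcLog));
    }
}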
Kopf
Elasticsearch Cluster Monitoring
bin/plugin -install lmenezes/elasticsearch-kopf
http://localhost:9200/_plugin/kopf
Example of Logstash collecting Tomcat logs:
Configure tomcat.conf on the Logstash agent
input {
  file {
    type => "usap"
    path => ["/opt/17173/apache-tomcat-7.0.50-8090/logs/catalina.out",
             "/opt/17173/apache-tomcat-7.0.50-8088/logs/catalina.out",
             "/opt/17173/apache-tomcat-7.0.50-8086/logs/catalina.out",
             "/opt/17173/apache-tomcat-7.0.50-8085/logs/catalina.out",
             "/opt/17173/apache-tomcat-6.0.37-usap-image/logs/catalina.out"]
    codec => multiline {
      pattern => "(^.+Exception:.+)|(^\s+at .+)|(^\s+\.\.\. \d+ more)|(^\s*Caused by:.+)"
      what => "previous"
    }
  }
}
filter {
  grok {
    # match => { "message" => "%{COMBINEDAPACHELOG}" }
    match => ["message", "%{TOMCATLOG}", "message", "%{CATALINALOG}"]
    remove_field => ["message"]
  }
}
output {
  # stdout { codec => rubydebug }
  redis { host => "redis.internal.173" data_type => "list" key => "usap" }
}
Modify logstash/patterns/grok-patterns
Add the Tomcat log grok pattern regular expressions:
# tomcat log
JAVACLASS (?:[a-zA-Z0-9-]+\.)+[A-Za-z0-9$]+
JAVALOGMESSAGE (.*)
THREAD [A-Za-z0-9\-\[\]]+
# MMM dd, yyyy HH:mm:ss eg: Jan 9, 2014 7:13:13 AM
CATALINA_DATESTAMP %{MONTH} %{MONTHDAY}, 20%{YEAR} %{HOUR}:?%{MINUTE}(?::?%{SECOND}) (?:AM|PM)
# yyyy-MM-dd HH:mm:ss,SSS ZZZ eg: 2014-01-09 17:32:25,527 -0800
TOMCAT_DATESTAMP 20%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{HOUR}:?%{MINUTE}(?::?%{SECOND}) %{ISO8601_TIMEZONE}
LOG_TIME %{HOUR}:?%{MINUTE}(?::?%{SECOND})
CATALINALOG %{CATALINA_DATESTAMP:timestamp} %{JAVACLASS:class} %{JAVALOGMESSAGE:logmessage}
# 11:27:51,786 [http-bio-8088-exec-4] DEBUG JsonRpcServer:504 - Invoking method: getHistory
# TOMCATLOG %{LOG_TIME:timestamp} %{THREAD:thread} %{LOGLEVEL:level} %{JAVACLASS:class} - %{JAVALOGMESSAGE:logmessage}
TOMCATLOG %{TOMCAT_DATESTAMP:timestamp} %{LOGLEVEL:level} %{JAVACLASS:class} - %{JAVALOGMESSAGE:logmessage}
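As a rough illustration, a (made-up) line in the TOMCAT_DATESTAMP format such as

2014-01-09 17:32:25,527 -0800 DEBUG com.example.JsonRpcServer - Invoking method: getHistory

would be matched by TOMCATLOG, yielding timestamp => "2014-01-09 17:32:25,527 -0800", level => "DEBUG", class => "com.example.JsonRpcServer", and logmessage => "Invoking method: getHistory".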
Start Tomcat log agent:
./logstash -f tomcat.conf --verbose -l ../tomcat.log &
Tomcat logs are stored in ES
Configure tomcat-es.conf
input {
  redis {
    host => 'redis.internal.173'
    data_type => 'list'
    port => "6379"
    key => 'usap'
    # type => 'redis-input'
    # codec => json
  }
}
output {
  # stdout { codec => rubydebug }
  elasticsearch {
    # host => "es1.internal.173,es2.internal.173,es3.internal.173"
    cluster => "soaes"
    index => "usap-%{+YYYY.MM.dd}"
  }
}
Start Tomcat log storage
./logstash -f tomcat-es.conf --verbose -l ../tomcat-es.log &
Example of Logstash collecting nginx and syslog logs
Configure nginx.conf on the Logstash agent
input {
  file {
    type => "linux-syslog"
    path => ["/var/log/*.log", "/var/log/messages"]
  }
  file {
    type => "nginx-access"
    path => "/usr/local/nginx/logs/access.log"
  }
  file {
    type => "nginx-error"
    path => "/usr/local/nginx/logs/error.log"
  }
}
output {
  # stdout { codec => rubydebug }
  redis { host => "redis.internal.173" data_type => "list" key => "nginx" }
}
Start nginx log agent
./logstash -f nginx.conf --verbose -l ../nginx.log &
Store nginx logs to es
Configure nginx-es.conf
input {
  redis {
    host => 'redis.internal.173'
    data_type => 'list'
    port => "6379"
    key => 'nginx'
    # type => 'redis-input'
    # codec => json
  }
}
filter {
  grok {
    type => "linux-syslog"
    pattern => "%{SYSLOGLINE}"
  }
  grok {
    type => "nginx-access"
    pattern => "%{IPORHOST:source_ip} - %{USERNAME:remote_user} \[%{HTTPDATE:timestamp}\] %{IPORHOST:host} %{QS:request} %{INT:status} %{INT:body_bytes_sent} %{QS:http_referer} %{QS:http_user_agent}"
  }
}
output {
  # stdout { codec => rubydebug }
  elasticsearch {
    # host => "es1.internal.173,es2.internal.173,es3.internal.173"
    cluster => "soaes"
    index => "nginx-%{+YYYY.MM.dd}"
  }
}
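Note that the nginx-access pattern assumes a custom nginx log_format whose fields appear in exactly this order. A hypothetical line that it would match:

192.168.1.10 - admin [09/Jan/2014:17:32:25 +0800] www.example.com "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"

yielding source_ip, remote_user, timestamp, host, request, status, body_bytes_sent, http_referer, and http_user_agent fields.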
Start nginx log storage
./logstash -f nginx-es.conf --verbose -l ../nginx-es.log &
ELK: a distributed real-time log processing platform