Flume+log4j+kafka


A log collection architecture based on Flume + log4j + Kafka

This article shows how to use Flume, log4j, and Kafka to build a standardized log collection pipeline.

Flume Basic Concepts

Flume is a mature and powerful log collection tool. There is plenty of documentation and many configuration examples online, so only a brief explanation is given here.
Flume is built around three basic concepts: source, channel, and sink:

source -- where logs enter Flume. Built-in types include: Avro Source, Thrift Source, Exec Source, JMS Source, Spooling Directory Source, Kafka Source, NetCat Source, Sequence Generator Source, Syslog Source, HTTP Source, Stress Source, Legacy Source, Custom Source, Scribe Source, and Twitter 1% Firehose Source.

channel -- the pipeline between source and sink; all events coming from a source are buffered here. Built-in types include: Memory Channel, JDBC Channel, Kafka Channel, File Channel, Spillable Memory Channel, Pseudo Transaction Channel, and Custom Channel.

sink -- where logs leave Flume. Built-in types include: HDFS Sink, Hive Sink, Logger Sink, Avro Sink, Thrift Sink, IRC Sink, File Roll Sink, Null Sink, HBase Sink, Async HBase Sink, MorphlineSolr Sink, ElasticSearch Sink, Kite Dataset Sink, Kafka Sink, and Custom Sink.

Log collection with Flume is flexible. Because Avro (and likewise Thrift) exists as both a sink and a source, we can chain several pipeline stages together, with one agent's sink feeding the next agent's source.

The point of chaining is that several pipelines can be merged into one, so that logs from many machines eventually reach the same sink.

Source and sink have been described above; the channel sits between them, and it is also how one source can feed several sinks: for a source that must write to more than one sink, we simply configure one channel per sink. That is the Flume mechanism in outline. We will not go into further detail here; the full configuration reference is available on the official website.
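As a small illustration, here is a minimal Flume agent sketch of this fan-out (the agent name, host names, ports, and sink types are hypothetical, not from the original article):

# one Avro source fanned out to two channels, each drained by its own sink
agent1.sources = src1
agent1.channels = ch1 ch2
agent1.sinks = sink1 sink2

agent1.sources.src1.type = avro
agent1.sources.src1.bind = 0.0.0.0
agent1.sources.src1.port = 4141
agent1.sources.src1.channels = ch1 ch2
# the replicating selector (the default) copies every event to both channels
agent1.sources.src1.selector.type = replicating

agent1.channels.ch1.type = memory
agent1.channels.ch2.type = memory

agent1.sinks.sink1.type = logger
agent1.sinks.sink1.channel = ch1

# an Avro sink can feed the Avro source of the next agent, which is how stages are chained
agent1.sinks.sink2.type = avro
agent1.sinks.sink2.hostname = next-agent-host
agent1.sinks.sink2.port = 4141
agent1.sinks.sink2.channel = ch2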

In general, we would use the Exec Source to tail a log file. That is simple but not convenient: Flume has to be deployed on every server we want to monitor, and any I/O problem with the target file (a format change, a renamed file, a locked file) is painful to handle. It is better to have applications send their logs over a socket instead of writing them to local files; this both reduces disk usage on the application servers and avoids file I/O problems, and Kafka is a good fit for that role. The overall architecture is as follows:

As you can see, the logs eventually flow to two places: HBase for persistence and a real-time processor. The reason Kafka does not talk to Storm directly is to isolate the Storm logic from the log sources through Flume. After a simple analysis in Storm, the results are pushed to RabbitMQ for web applications to consume.

HBase persistence means the raw logs are stored in HBase so that they can be queried later; the real-time processor covers real-time log statistics, error and exception e-mail alerts, and similar functions.
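A sketch of the Flume agent for this architecture, assuming Flume 1.6 and its Kafka source; the topic name matches the appender configuration later in this article, while the agent name, ZooKeeper and host addresses, HBase table, and port are hypothetical:

collector.sources = kafkaSrc
collector.channels = hbaseCh stormCh
collector.sinks = hbaseSink stormSink

# pull the JSON log events out of Kafka and copy each one to both channels
collector.sources.kafkaSrc.type = org.apache.flume.source.kafka.KafkaSource
collector.sources.kafkaSrc.zookeeperConnect = zk-01:2181,zk-02:2181,zk-03:2181
collector.sources.kafkaSrc.topic = server_log
collector.sources.kafkaSrc.groupId = flume-collector
collector.sources.kafkaSrc.channels = hbaseCh stormCh
collector.sources.kafkaSrc.selector.type = replicating

collector.channels.hbaseCh.type = file
collector.channels.stormCh.type = memory

# raw logs are persisted to HBase for later queries
collector.sinks.hbaseSink.type = hbase
collector.sinks.hbaseSink.table = server_log
collector.sinks.hbaseSink.columnFamily = raw
collector.sinks.hbaseSink.channel = hbaseCh

# the second copy is forwarded (here via Avro) towards the real-time processing side
collector.sinks.stormSink.type = avro
collector.sinks.stormSink.hostname = realtime-processor-host
collector.sinks.stormSink.port = 4545
collector.sinks.stormSink.channel = stormCh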

In order to capture abnormal events reliably, we also need some standardization in the application code, such as a uniform exception-handling helper that every catch block goes through.
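A minimal sketch of such a helper (a hypothetical class, not taken from the article); routing every catch block through one method guarantees that the Throwable, and with it the full stack trace, always reaches the log:

package com.banksteel.log.demo;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** Hypothetical helper: one place through which all business code reports exceptions. */
public final class ExceptionLogger {

    private ExceptionLogger() {
    }

    public static void handle(Class<?> caller, String businessContext, Throwable t) {
        Logger logger = LoggerFactory.getLogger(caller);
        // passing the Throwable as the last argument lets the layout serialize the whole stack trace
        logger.error("An error occurred while processing {}", businessContext, t);
    }
}

A business method would then simply call ExceptionLogger.handle(OrderService.class, "order creation", e) in its catch block (OrderService is, again, hypothetical).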

Log Output format

Since the goal is to unify logging, a uniform, well-defined log format is very important. The PatternLayout we used in the past is very inconvenient when the individual fields have to be split out later, as the following output shows:

2016-05-08 19:32:55,572 [INFO] [main]-[com.banksteel.log.demo.log4j.Demo.main(Demo.java:13)] Output information ...
2016-05-08 19:32:55,766 [DEBUG] [main]-[com.banksteel.log.demo.log4j.Demo.main(Demo.java:15)] Debugging information ...
2016-05-08 19:32:55,775 [WARN] [main]-[com.banksteel.log.demo.log4j.Demo.main(Demo.java:16)] Warning message ...
2016-05-08 19:32:55,783 [ERROR] [main]-[com.banksteel.log.demo.log4j.Demo.main(Demo.java:20)] An error occurred while processing the business logic ...
java.lang.Exception: Error message ah
    at com.banksteel.log.demo.log4j.Demo.main(Demo.java:18)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)

Parsing such a log is troublesome, and as soon as a developer's output deviates from the agreed PatternLayout convention, the parser breaks.
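To make the problem concrete, here is a rough sketch (not from the article) of parsing the text format above with a regular expression: it handles the single-line entries, but the stack-trace lines do not match at all and have to be stitched back onto the preceding ERROR entry by hand.

package com.banksteel.log.demo;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternLayoutParseDemo {
    // date, level, thread, location, message -- works for one-line entries only
    private static final Pattern LINE = Pattern.compile(
            "^(\\S+ \\S+) \\[(\\w+)\\s*\\] \\[(.+?)\\]-\\[(.+?)\\] (.*)$");

    public static void main(String[] args) {
        String entry = "2016-05-08 19:32:55,572 [INFO] [main]-[com.banksteel.log.demo.log4j.Demo.main(Demo.java:13)] Output information ...";
        String stackLine = "\tat com.banksteel.log.demo.log4j.Demo.main(Demo.java:18)";

        Matcher m = LINE.matcher(entry);
        System.out.println(m.matches());                        // true: the fields can be pulled out
        System.out.println(LINE.matcher(stackLine).matches());  // false: stack-trace lines need extra handling
    }
}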

To solve the format problem once and for all, we standardize the log output as JSON. For example, the JsonLayout provided in log4j 2.x produces output in the following format:

{"Timemillis": 1462712870612, "thread": "Main", "Level": "FATAL", "Loggername": "Com.banksteel.log.demo.log4j2.De Mo "," message ":" There has been an exception that could affect the program to continue running! "," thrown ": {" Commonelementcount ": 0," localizedmessage ":" Error message AH "," message ":" Error message Ah "," name ":" Java.l Ang.      Exception "," extendedstacktrace ": [{" Class ":" Com.banksteel.log.demo.log4j2.Demo "," Method ":" Main ",    "File": "Demo.java", "line": +, "exact": True, "location": "Classes/", "Version": "?" }, {"Class": "Sun.reflect.NativeMethodAccessorImpl", "Method": "Invoke0", "File": "Nativemethodaccessor Impl.java "," line ":-2," exact ": false," location ":"? "," Version ":" 1.7.0_80 "}, {" Class      ":" Sun.reflect.NativeMethodAccessorImpl "," Method ":" Invoke "," file ":" Nativemethodaccessorimpl.java ", "Line": $, "exact": false, "location": "?", "Version": "1.7.0_80"}, {"Class" : "Sun.reflect.DelegatingMethodAccessorImpl", "Method": "Invoke", "file": "Delegatingmethodaccessorimpl.java" , "line": +, "exact": false, "location": "?", "Version": "1.7.0_80"}, {"Class": "java."      Lang.reflect.Method "," Method ":" Invoke "," file ":" Method.java "," line ": 606," exact ": false,      "Location": "?", "Version": "1.7.0_80"}, {"Class": "Com.intellij.rt.execution.application.AppMain", "Method": "Main", "File": "Appmain.java", "line": 144, "exact": True, "location": "Idea_rt.jar"    , "version": "?" }]}, "Endofbatch": false, "LOGGERFQCN": "Org.apache.logging.log4j.spi.AbstractLogger", "source": {"Class": " Com.banksteel.log.demo.log4j2.Demo "," Method ":" Main "," File ":" Demo.java "," line ": 23}}

As you can see, this format can be parsed easily in any language.
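As a small illustration (assuming Jackson, which the pom files below already pull in), a consumer can read one event like this; the json string here stands in for a message polled from the Kafka topic:

package com.banksteel.log.demo;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonLogParseDemo {
    public static void main(String[] args) throws Exception {
        // in reality this string would come from a Kafka consumer
        String json = "{\"timeMillis\":1462712870612,\"thread\":\"main\",\"level\":\"FATAL\",\"message\":\"...\"}";
        JsonNode event = new ObjectMapper().readTree(json);
        System.out.println(event.get("level").asText() + " @ " + event.get("timeMillis").asLong());
    }
}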

Integrating the logging frameworks with Kafka

We will take log4j 1.x and log4j 2.x as examples.

log4j 1.x integration with Kafka

First, the dependencies in pom.xml are as follows:

<dependencies>
    <dependency>
        <groupId>log4j</groupId>
        <artifactId>log4j</artifactId>
        <version>1.2.17</version>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-core</artifactId>
        <version>2.7.4</version>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <version>2.7.4</version>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-annotations</artifactId>
        <version>2.7.4</version>
    </dependency>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-clients</artifactId>
        <version>0.8.2.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka_2.11</artifactId>
        <version>0.8.2.1</version>
    </dependency>
    <!-- the demo class below logs through slf4j; these two are needed for it to compile
         and to bind slf4j to log4j 1.x (any 1.7.x release should work) -->
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-api</artifactId>
        <version>1.7.21</version>
    </dependency>
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-log4j12</artifactId>
        <version>1.7.21</version>
    </dependency>
</dependencies>

Note that the Kafka version used here is 0.8.2.1. Although 0.9.0.1 is available, only 0.8.2.1 works with this setup without throwing an exception (you can try the newer version yourself to see the specific exception).

log4j 1.x does not ship with a JSON layout of its own, so we need to implement one ourselves. Like the log4j 2.x JsonLayout it is based on Jackson, and its output format follows log4j 2.x as closely as possible:

package com.banksteel.log.demo.log4j;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.log4j.Layout;
import org.apache.log4j.spi.LoggingEvent;

import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

/**
 * Extends log4j 1.x with a JSON layout. Like log4j 2.x it is based on Jackson,
 * and the format is modeled on the log4j 2.x JsonLayout.
 *
 * @author hot-blooded bug man
 * @version 1.0.0
 * @since Created by gebug on 2016/5/8.
 */
public class JsonLayout extends Layout {

    private final ObjectMapper mapper = new ObjectMapper();

    public String format(LoggingEvent loggingEvent) {
        String json;
        Map<String, Object> map = new LinkedHashMap<String, Object>(0);
        Map<String, Object> source = new LinkedHashMap<String, Object>(0);

        source.put("method", loggingEvent.getLocationInformation().getMethodName());
        source.put("class", loggingEvent.getLocationInformation().getClassName());
        source.put("file", loggingEvent.getLocationInformation().getFileName());
        source.put("line", safeParse(loggingEvent.getLocationInformation().getLineNumber()));

        map.put("timeMillis", loggingEvent.getTimeStamp());
        map.put("thread", loggingEvent.getThreadName());
        map.put("level", loggingEvent.getLevel().toString());
        map.put("loggerName", loggingEvent.getLocationInformation().getClassName());
        map.put("source", source);
        map.put("endOfBatch", false);
        map.put("loggerFqcn", loggingEvent.getFQNOfLoggerClass());
        map.put("message", safeToString(loggingEvent.getMessage()));
        map.put("thrown", formatThrowable(loggingEvent));

        try {
            json = mapper.writeValueAsString(map);
        } catch (JsonProcessingException e) {
            return e.getMessage();
        }
        return json;
    }

    private List<Map<String, Object>> formatThrowable(LoggingEvent le) {
        if (le.getThrowableInformation() == null
                || le.getThrowableInformation().getThrowable() == null) {
            return null;
        }
        List<Map<String, Object>> traces = new LinkedList<Map<String, Object>>();
        StackTraceElement[] stackTraceElements = le.getThrowableInformation().getThrowable().getStackTrace();
        for (StackTraceElement stackTraceElement : stackTraceElements) {
            // a new map per frame; otherwise every entry ends up holding the last frame
            Map<String, Object> throwableMap = new LinkedHashMap<String, Object>(0);
            throwableMap.put("class", stackTraceElement.getClassName());
            throwableMap.put("file", stackTraceElement.getFileName());
            throwableMap.put("line", stackTraceElement.getLineNumber());
            throwableMap.put("method", stackTraceElement.getMethodName());
            // not available in log4j 1.x; kept as placeholders for format parity with log4j 2.x
            throwableMap.put("location", "?");
            throwableMap.put("version", "?");
            traces.add(throwableMap);
        }
        return traces;
    }

    private static String safeToString(Object obj) {
        if (obj == null) return null;
        try {
            return obj.toString();
        } catch (Throwable t) {
            return "Error getting message: " + t.getMessage();
        }
    }

    private static Integer safeParse(String obj) {
        try {
            return Integer.parseInt(obj);
        } catch (NumberFormatException t) {
            return null;
        }
    }

    public boolean ignoresThrowable() {
        return false;
    }

    public void activateOptions() {
    }
}

It is not complicated. Note that some pieces of information simply cannot be obtained in log4j 1.x and are filled with "?"; those fields are kept only so that the format matches log4j 2.x exactly. The log4j.properties configuration for Kafka is as follows:

log4j.rootLogger=INFO,console
log4j.logger.com.banksteel.log.demo.log4j=DEBUG,kafka

log4j.appender.kafka=kafka.producer.KafkaLog4jAppender
log4j.appender.kafka.topic=server_log
log4j.appender.kafka.brokerList=Kafka-01:9092,Kafka-02:9092,Kafka-03:9092
log4j.appender.kafka.compressionType=none
log4j.appender.kafka.syncSend=true
log4j.appender.kafka.layout=com.banksteel.log.demo.log4j.JsonLayout

# appender console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.out
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d [%-5p] [%t]-[%l] %m%n

By printing a log entry we can see that the final output format is as follows:

{"Timemillis": 1462713132695, "thread": "Main", "Level": "ERROR", "Loggername": "Com.banksteel.log.demo.log4j.Demo",  "Source": {"method": "Main", "Class": "Com.banksteel.log.demo.log4j.Demo", "File": "Demo.java", "line": 20     }, "Endofbatch": false, "LOGGERFQCN": "Org.slf4j.impl.Log4jLoggerAdapter", "message": "An error occurred while processing business logic ...", "thrown": [ {"Class": "Com.intellij.rt.execution.application.AppMain", "File": "Appmain.java", "line": 144, "    Method ":" Main "," Location ":"? "," Version ":"? "      }, {"Class": "Com.intellij.rt.execution.application.AppMain", "File": "Appmain.java", "line": 144,    "Method": "Main", "Location": "?", "Version": "?"      }, {"Class": "Com.intellij.rt.execution.application.AppMain", "File": "Appmain.java", "line": 144,    "Method": "Main", "Location": "?", "Version": "?" }, {"Class": "Com.intellij.rt.execution.application.AppMain", "FIle ":" Appmain.java "," line ": 144," Method ":" Main "," Location ":"? "," Version ":"? "      }, {"Class": "Com.intellij.rt.execution.application.AppMain", "File": "Appmain.java", "line": 144,    "Method": "Main", "Location": "?", "Version": "?"      }, {"Class": "Com.intellij.rt.execution.application.AppMain", "File": "Appmain.java", "line": 144,    "Method": "Main", "Location": "?", "Version": "?" }  ]}

Test class:

package com.banksteel.log.demo.log4j;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * @author hot-blooded bug man
 * @version 1.0.0
 * @since Created by gebug on 2016/5/8.
 */
public class Demo {

    private static final Logger logger = LoggerFactory.getLogger(Demo.class);

    public static void main(String[] args) {
        logger.info("Output information ...");
        logger.trace("Random print ...");
        logger.debug("Debug info ...");
        logger.warn("Warning message ...");
        try {
            throw new Exception("Error message ah");
        } catch (Exception e) {
            logger.error("An error occurred while processing business logic ...", e);
        }
    }
}

Log4j 2.x integration with Kafka

log4j 2.x supports JsonLayout natively and integrates with Kafka conveniently; we only need to configure it step by step. pom.xml is as follows:

<dependencies>
    <dependency>
        <groupId>org.apache.logging.log4j</groupId>
        <artifactId>log4j-api</artifactId>
        <version>2.5</version>
    </dependency>
    <dependency>
        <groupId>org.apache.logging.log4j</groupId>
        <artifactId>log4j-core</artifactId>
        <version>2.5</version>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-core</artifactId>
        <version>2.7.4</version>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <version>2.7.4</version>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-annotations</artifactId>
        <version>2.7.4</version>
    </dependency>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka_2.11</artifactId>
        <version>0.9.0.1</version>
    </dependency>
</dependencies>

The log4j2.xml configuration file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<!-- log4j2 configuration file -->
<Configuration status="DEBUG" strict="true" name="Log4j2_Demo" packages="com.banksteel.log.demo.log4j2">
    <Properties>
        <Property name="LogPath">log</Property>
    </Properties>
    <Appenders>
        <!-- console output style -->
        <Console name="Console" target="SYSTEM_OUT">
            <PatternLayout pattern="%highlight{%d{yyyy-MM-dd HH:mm:ss} %d{UNIX_MILLIS} [%t] %-5p %c{1.}:%l - %msg%n}"/>
        </Console>
        <!-- Kafka log collection; Storm will parse the log into fields and store them in HBase. -->
        <Kafka name="Kafka" topic="server_log">
            <!-- transfer the log events as JSON -->
            <JsonLayout complete="true" locationInfo="true"/>
            <!-- Kafka cluster configuration; the broker hostnames must be resolvable, e.g. via the local hosts file or DNS -->
            <Property name="bootstrap.servers">Kafka-01:9092,Kafka-02:9092,Kafka-03:9092</Property>
        </Kafka>
    </Appenders>
    <Loggers>
        <Root level="DEBUG">
            <!-- console output -->
            <AppenderRef ref="Console"/>
            <!-- Kafka output -->
            <AppenderRef ref="Kafka"/>
        </Root>
    </Loggers>
</Configuration>

That's all. We can then see the full output in Kafka:

{"Timemillis": 1462712870591, "thread": "Main", "Level": "ERROR", "Loggername": "Com.banksteel.log.demo.log4j2.De Mo "," message ":" An error occurred while processing business logic ... "," thrown ": {" Commonelementcount ": 0," localizedmessage ":" Error message Ah "," mes Sage ":" Error message Ah "," name ":" Java.lang.Exception "," extendedstacktrace ": [{" Class ":" Com.banksteel.log.demo.l Og4j2. Demo "," Method ":" Main "," File ":" Demo.java "," line ": +," exact ": True," location ":" Classe    s/"," Version ":"? " }, {"Class": "Sun.reflect.NativeMethodAccessorImpl", "Method": "Invoke0", "File": "Nativemethodaccessor Impl.java "," line ":-2," exact ": false," location ":"? "," Version ":" 1.7.0_80 "}, {" Class      ":" Sun.reflect.NativeMethodAccessorImpl "," Method ":" Invoke "," file ":" Nativemethodaccessorimpl.java ", "Line": $, "exact": false, "location": "?", "Version": "1.7.0_80"}, {"Class": "sUn.reflect.DelegatingMethodAccessorImpl "," Method ":" Invoke "," file ":" Delegatingmethodaccessorimpl.java ", "Line": +, "exact": false, "location": "?", "Version": "1.7.0_80"}, {"Class": "Java.lang". Reflect.  Method "," Method ":" Invoke "," file ":" Method.java "," line ": 606," exact ": false," location ": "?", "Version": "1.7.0_80"}, {"Class": "Com.intellij.rt.execution.application.AppMain", "Method": "Main", "File": "Appmain.java", "line": 144, "exact": True, "location": "Idea_rt.jar", "Versi    On ":"? " }]}, "Endofbatch": false, "LOGGERFQCN": "Org.apache.logging.log4j.spi.AbstractLogger", "source": {"Class": " Com.banksteel.log.demo.log4j2.Demo "," Method ":" Main "," File ":" Demo.java "," line ": 22}}

To reduce the space each log entry occupies, we usually also set the compact attribute of JsonLayout to true, so that the messages written to Kafka contain no extra spaces or line breaks.
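In the log4j2.xml above, that just means adding the attribute to the layout element (attribute names as in the log4j 2.x JsonLayout):

<JsonLayout complete="true" compact="true" locationInfo="true"/>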

Finally

In real development we pull in many third-party dependencies, and these often drag in any number of logging frameworks of their own. To make sure everything works as in this example, use exactly the package names and version numbers shown here. Also remember that the log4j 1.x JSON output only simulates the 2.x fields, so some of them are filled with "?"; if you need them to be exact, extend the layout yourself.
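If a third-party dependency does drag in a conflicting logging binding, it can be excluded in the pom. A sketch with a hypothetical dependency (the groupId and artifactId are placeholders):

<dependency>
    <groupId>some.thirdparty</groupId>
    <artifactId>some-library</artifactId>
    <version>1.0</version>
    <exclusions>
        <!-- keep a second log4j / slf4j binding off the classpath -->
        <exclusion>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
    </exclusions>
</dependency>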

A brief note on the conventions for each log level:

log.error -- error messages, usually written in a catch block; use log.error("An error occurred", e) to record the full exception stack.

log.fatal -- critical errors; this level is used for errors that cause the program to exit unexpectedly.

log.warn -- warnings.

log.info -- informational messages.

log.trace -- simple trace output.

log.debug -- debugging information.

Tags: Flume, log4j 1.x, Log4j 2.x, Kafka, Jsonlayout
