Flume lets users customize not only the source but also the sink. A user-defined sink only needs to extend one base class, AbstractSink, and implement its process() method. For example, my current requirement is this: anyone who uses my custom sink must provide a file name (the full path, if the file lives in a specific directory), and the sink appends the incoming data to that file. Because everything is driven by the user's configuration, we can ignore which source feeds the sink for now. Let's write the code:
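1. Add the jar packages the project requires: flume-ng-configuration-1.7.0.jar, flume-ng-core-1.7.0.jar, flume-ng-sdk-1.7.0.jar (the class below also imports org.slf4j, so an SLF4J API jar must be on the compile classpath as well)
2. Write the custom class that implements the requirement: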
package com.harderxin.flume.test;

import java.io.File;
import java.io.FileOutputStream;
import java.nio.charset.StandardCharsets;

import org.apache.flume.Channel;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Transaction;
import org.apache.flume.conf.Configurable;
import org.apache.flume.sink.AbstractSink;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class MySinks extends AbstractSink implements Configurable {
    private static final Logger logger = LoggerFactory.getLogger(MySinks.class);
    private static final String PROP_KEY_ROOTPATH = "fileName";
    private String fileName;

    @Override
    public void configure(Context context) {
        // Read the target file name from the agent configuration
        fileName = context.getString(PROP_KEY_ROOTPATH);
    }

    @Override
    public Status process() throws EventDeliveryException {
        Channel ch = getChannel();
        // Get the transaction
        Transaction txn = ch.getTransaction();
        // Begin the transaction
        txn.begin();
        try {
            Event event = ch.take();
            if (event == null) {
                // Channel is empty: finish the transaction and back off
                // instead of spinning in a loop
                txn.commit();
                return Status.BACKOFF;
            }
            logger.debug("Get event.");
            String body = new String(event.getBody(), StandardCharsets.UTF_8);
            System.out.println("event.getBody()-----" + body);
            String res = body + ":" + System.currentTimeMillis() + "\r\n";
            // Append to the file; try-with-resources closes the stream.
            // An I/O failure propagates and triggers the rollback below.
            try (FileOutputStream fos = new FileOutputStream(new File(fileName), true)) {
                fos.write(res.getBytes(StandardCharsets.UTF_8));
            }
            txn.commit();
            return Status.READY;
        } catch (Throwable th) {
            txn.rollback();
            if (th instanceof Error) {
                throw (Error) th;
            } else {
                throw new EventDeliveryException(th);
            }
        } finally {
            txn.close();
        }
    }
}
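The file-writing step inside process() can be tried out on its own, without a running agent. The following is a minimal sketch using only the JDK; the class name AppendLineSketch and the method appendLine are made up for illustration and are not part of Flume or of the sink above. It appends one "body:timestamp" line per call, which is the same record layout the sink writes.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, for illustration only (not a Flume class).
class AppendLineSketch {

    // Appends "<body>:<timestampMillis>\r\n" to the file, creating the file if needed.
    // The parent directory must already exist, otherwise an IOException is thrown.
    public static void appendLine(Path file, String body, long timestampMillis) throws IOException {
        String line = body + ":" + timestampMillis + "\r\n";
        Files.write(file, line.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for the configured fileName.
        Path tmp = Files.createTempFile("mysinks", ".txt");
        appendLine(tmp, "hello", System.currentTimeMillis());
        appendLine(tmp, "world", System.currentTimeMillis());
        System.out.print(new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8));
    }
}
```

Letting the IOException escape, rather than catching and printing it, matters in the sink: only an exception that reaches the outer catch block causes the transaction to be rolled back.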
3. Package the project as a jar and put it in the lib directory of the unpacked Flume installation. The dependency jars do not need to be copied, because they already exist in Flume's lib directory.
4. Write the configuration file:
# Specify the component names of the agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Specify the flume source (the address to listen on)
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 5678
# Specify the flume sink
a1.sinks.k1.type = com.harderxin.flume.test.MySinks
a1.sinks.k1.fileName = d://flume-test//sink//mysinks.txt
# Specify the flume channel
a1.channels.c1.type = memory
a1.channels.c1.capacity =
a1.channels.c1.transactionCapacity =
# Bind source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
Description: The source in this configuration listens on local port 5678. The user connects to that port from the console with the telnet command and types some text, and our custom sink saves that text to the configured file. The a1.sinks.k1.fileName property in the file is the property defined by our custom sink. It is set by the user, and the directory it points to must be created in advance.
5. Start Flume with the start command described in the previous article, then test: open a cmd window, use telnet to connect to port 5678, type some text and press Enter. The text then appears in the configured mysinks.txt file.
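If telnet is not installed, the same test can be driven from a few lines of Java. This is only a sketch: the class name NetcatClientSketch and the method sendLine are made up for illustration, the host and port are taken from the configuration above, and it assumes the netcat source answers each line with a short acknowledgement. If no reply arrives within three seconds, the method returns null instead of blocking.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

// Hypothetical test client, for illustration only (not a Flume class).
class NetcatClientSketch {

    // Sends one newline-terminated line and returns the first reply line,
    // or null if the server sends nothing within the timeout.
    public static String sendLine(String host, int port, String message) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            socket.setSoTimeout(3000); // assumption: 3 seconds is long enough for a reply
            OutputStream out = socket.getOutputStream();
            out.write((message + "\n").getBytes(StandardCharsets.UTF_8));
            out.flush();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
            try {
                return in.readLine();
            } catch (SocketTimeoutException e) {
                return null;
            }
        }
    }

    public static void main(String[] args) {
        try {
            // localhost:5678 matches a1.sources.r1.bind and a1.sources.r1.port
            System.out.println("reply: " + sendLine("localhost", 5678, "hello flume"));
        } catch (IOException e) {
            System.err.println("Agent not reachable: " + e.getMessage());
        }
    }
}
```

After a successful run, the configured file should contain a new line starting with the text that was sent.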
Notes on the custom sink class:
1. The user-defined sink implements the Configurable interface, which means implementing the configure(Context) method. Its main job is to read the information the user has configured. If the sink has many properties that users need to set themselves, all of them can be read in this method. The Context class provides a range of getters for this, such as getString, getLong, getBoolean, etc.
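The pattern in note 1 is "one required property plus several optional ones with defaults". The sketch below shows that pattern with a plain Map standing in for Flume's Context so that it runs without the Flume jars. The class SinkSettingsSketch and the properties maxBytes and appendTimestamp are hypothetical and exist only for this illustration; in a real sink the lookups would go through the Context getters named above (they also come in a variant that takes a default value).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical settings holder; a plain Map stands in for Flume's Context.
class SinkSettingsSketch {
    public final String fileName;          // required, as in MySinks
    public final long maxBytes;            // hypothetical optional property
    public final boolean appendTimestamp;  // hypothetical optional property

    public SinkSettingsSketch(Map<String, String> props) {
        String name = props.get("fileName");
        if (name == null || name.isEmpty()) {
            // Fail at configuration time rather than on the first event
            throw new IllegalArgumentException("fileName is required");
        }
        fileName = name;
        maxBytes = Long.parseLong(props.getOrDefault("maxBytes", "1048576"));
        appendTimestamp = Boolean.parseBoolean(props.getOrDefault("appendTimestamp", "true"));
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("fileName", "d://flume-test//sink//mysinks.txt");
        SinkSettingsSketch s = new SinkSettingsSketch(props);
        System.out.println(s.fileName + " " + s.maxBytes + " " + s.appendTimestamp);
    }
}
```

Validating the required property inside configure means a missing fileName surfaces when the agent starts, not later as a NullPointerException in process().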
2. The core processing logic lives in the process method. The getChannel method is already implemented in the parent class AbstractSink and returns the Channel object that delivers events to the sink. The channel in turn provides a method for obtaining a transaction, getTransaction(), and a method for taking out a message event, take(). These two methods are very important: the Transaction object is what guarantees that an event counts as consumed only when the custom sink has processed it successfully. After successful consumption, call commit to commit the transaction, and the event is removed from the channel queue. If consumption fails, call rollback, and the event stays in the channel queue so that it can be consumed next time, which ensures that no message is lost.
The take method removes one message from the channel (in Flume a message is called an Event). The getBody() method then returns the content of the message, and with it we can implement our function: save it to a file, insert it into a database, and so on.
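The commit and rollback behaviour described in note 2 can be pictured with a toy model. The class below is a deliberately simplified, single-threaded sketch written for this article; ToyChannel is a hypothetical name and this is not how Flume's channels are implemented. It only illustrates the contract: an event taken inside a transaction disappears for good on commit and returns to the head of the queue on rollback.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of take/commit/rollback; not Flume code.
class ToyChannel {
    private final Deque<String> queue = new ArrayDeque<>();
    // Events taken in the currently open transaction
    private final Deque<String> taken = new ArrayDeque<>();

    public void put(String event) {
        queue.addLast(event);
    }

    // Removes the head of the queue but remembers it until commit or rollback.
    // Returns null when the queue is empty.
    public String take() {
        String event = queue.pollFirst();
        if (event != null) {
            taken.addLast(event);
        }
        return event;
    }

    // Consumption succeeded: the taken events are gone for good.
    public void commit() {
        taken.clear();
    }

    // Consumption failed: put the taken events back at the head, in their original order.
    public void rollback() {
        while (!taken.isEmpty()) {
            queue.addFirst(taken.pollLast());
        }
    }

    public int size() {
        return queue.size();
    }

    public static void main(String[] args) {
        ToyChannel ch = new ToyChannel();
        ch.put("a");
        ch.put("b");
        ch.take();
        ch.rollback();
        System.out.println(ch.size()); // prints 2: "a" is back in the queue
        ch.take();
        ch.commit();
        System.out.println(ch.size()); // prints 1: "a" was consumed
    }
}
```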
3. Compare the process methods of a custom source and a custom sink:
Custom source: obtains the ChannelProcessor object through the getChannelProcessor method, converts the message into a Flume Event object, and hands it to the channel through the processEvent method.
Custom sink: obtains the Channel object through the getChannel method, takes the Event out of the channel with the take method, and converts it into the message data we need for processing.
The process method of the source is the producer of events and keeps sending them to the channel, while the process method of the sink is the consumer of events and keeps taking them out of the channel for processing.
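The producer and consumer roles in note 3 can be sketched with the JDK's BlockingQueue playing the part of the channel. This is an analogy under stated assumptions, not Flume code: ProducerConsumerSketch and run are hypothetical names, and unlike this queue, a Flume channel's take() returns null on an empty channel instead of blocking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Analogy only: a bounded queue stands in for the channel.
class ProducerConsumerSketch {

    // Starts a producer thread (the "source") and consumes on the calling thread (the "sink").
    public static List<String> run(List<String> messages) throws InterruptedException {
        BlockingQueue<String> channel = new ArrayBlockingQueue<>(2); // small bounded "channel"
        Thread source = new Thread(() -> {
            try {
                for (String m : messages) {
                    channel.put(m); // blocks while the channel is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        source.start();

        List<String> consumed = new ArrayList<>();
        for (int i = 0; i < messages.size(); i++) {
            consumed.add(channel.take()); // blocks while the channel is empty
        }
        source.join();
        return consumed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(Arrays.asList("e1", "e2", "e3")));
    }
}
```

With one producer and one consumer the events come out in the order they went in, and the bounded capacity shows how a slow sink eventually slows the source down.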
Of course, we can also combine a custom source and a custom sink in one configuration to implement whatever we need, configured as follows:
# Specify the component names of the agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Specify the flume source
a1.sources.r1.type = com.harderxin.flume.test.MySource
# Specify the flume sink
a1.sinks.k1.type = com.harderxin.flume.test.MySinks
a1.sinks.k1.fileName = d://flume-test//sink//winlog.txt
# Specify the flume channel
a1.channels.c1.type = memory
a1.channels.c1.capacity =
a1.channels.c1.transactionCapacity = 100
a1.channels.c1.byteCapacityBufferPercentage =
a1.channels.c1.byteCapacity = 800000
# Bind source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
By this point, doesn't Flume feel quite powerful? It is a top-level Apache project, and there is a lot in it worth studying and learning. I will keep pushing forward on this road ...