Exec Source Introduction
Exec source runs a given Unix command at startup and expects that process to continuously produce data on standard output (stderr is discarded unless logStdErr is set to true). If the process exits for any reason, the exec source also exits and produces no further data.
The required properties are channels, type, and command:
Property name | Default | Description
channels | – |
type | – | The component type name; must be exec
command | – | The command to execute
shell | – | The shell invocation used to run the command
restartThrottle | 10000 | How long (in milliseconds) to wait before attempting to restart the command process
restart | false | Whether to restart the command process if it dies
logStdErr | false | Whether the command's stderr should be logged
batchSize | 20 | The maximum number of lines to read and send to the channel at one time
batchTimeout | 3000 | How long (in milliseconds) to wait before pushing data downstream if the buffer has not yet reached batchSize
selector.type | replicating | replicating or multiplexing
selector.* | | Depends on the value of selector.type
interceptors | – | A space-separated list of interceptors
interceptors.* | |
ExecSource can collect data in real time, but data is lost whenever Flume is not running or the shell command fails. For example, suppose tail -F is used to read the Nginx access log. If Flume goes down, Nginx keeps writing to the log file, and the entries written during the outage are never picked up by Flume. For better reliability, consider using the Spooling Directory Source instead. It is not real-time, but by splitting (rotating) the log files frequently it can achieve near-real-time collection.
Exec Source Example
a1.sources = r1
a1.channels = c1
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/secure
a1.sources.r1.channels = c1
The 'shell' property specifies the shell used to invoke 'command' (for example Bash or PowerShell). The 'command' is passed to the 'shell' as an argument for execution, which lets the command use shell features such as wildcards, backticks, pipes, loops, and conditionals. If 'shell' is not configured, 'command' is invoked directly. Common values for 'shell' are '/bin/sh -c', '/bin/ksh -c', 'cmd /c', and 'powershell -Command'.
a1.sources.tailsource-1.type = exec
a1.sources.tailsource-1.shell = /bin/bash -c
a1.sources.tailsource-1.command = for i in /path/*.txt; do cat $i; done
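To make the difference between the two invocation modes concrete, here is a minimal Java sketch of how the argument array might be assembled in each case. The class and method names (ShellCommandSketch, buildShellArgs, buildDirectArgs) are hypothetical and are not Flume's own code; the sketch only assumes that the shell string is split on whitespace and the whole command is appended as a single argument.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical names, not part of Flume.
class ShellCommandSketch {

    // With a shell configured: split the shell invocation on whitespace,
    // then append the entire command as ONE argument so the shell parses it.
    static String[] buildShellArgs(String shell, String command) {
        List<String> args = new ArrayList<>(Arrays.asList(shell.trim().split("\\s+")));
        args.add(command);
        return args.toArray(new String[0]);
    }

    // Without a shell: the command itself is split on whitespace and
    // executed directly, so pipes, wildcards and loops are not interpreted.
    static String[] buildDirectArgs(String command) {
        return command.trim().split("\\s+");
    }

    public static void main(String[] unused) {
        System.out.println(Arrays.toString(
            buildShellArgs("/bin/bash -c", "for i in /path/*.txt; do cat $i; done")));
        System.out.println(Arrays.toString(
            buildDirectArgs("tail -F /var/log/secure")));
    }
}
```

In the first case the loop reaches bash as one string; in the second, each whitespace-separated token becomes its own argument, which is why shell syntax only works when 'shell' is set.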
Exec Source Source Code Analysis
First, the configure(Context context) method. ExecSource's configuration handling is straightforward: it simply reads the properties listed in the table above.
Second, the start() method
@Override
public void start() {
  logger.info("Exec source starting with command:{}", command);
  // Thread pool
  executor = Executors.newSingleThreadExecutor();
  // Build the ExecRunnable object, passing in the parameters from the configuration file
  runner = new ExecRunnable(shell, command, getChannelProcessor(), sourceCounter,
      restart, restartThrottle, logStderr, bufferCount, batchTimeout, charset);
  // FIXME: Use a callback-like executor / future to signal us upon failure.
  runnerFuture = executor.submit(runner);
  /*
   * NB: This comes at the end rather than the beginning of the method because
   * it sets our state to running. We want to make sure the executor is alive
   * and well first.
   */
  // Start the counter
  sourceCounter.start();
  super.start();
  logger.debug("Exec source started");
}
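The pattern used by start() (a single-thread executor, one submitted Runnable, and a retained Future) can be sketched in isolation as follows. This is a simplified illustration, not Flume's code: the class RunnerLifecycleSketch and its stop() and awaitDone() methods are hypothetical, and the five-second shutdown wait is an arbitrary choice.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative only: hypothetical names, not part of Flume.
class RunnerLifecycleSketch {
    private ExecutorService executor;
    private Future<?> runnerFuture;

    // Mirrors the shape of start(): create the executor, submit the runner,
    // and keep the Future so the task can be cancelled or inspected later.
    void start(Runnable runner) {
        executor = Executors.newSingleThreadExecutor();
        runnerFuture = executor.submit(runner);
    }

    // Waits up to timeoutMillis; returns true only if the runner finished normally.
    boolean awaitDone(long timeoutMillis) {
        try {
            runnerFuture.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException | TimeoutException | CancellationException e) {
            return false;
        }
    }

    // Cancels the runner (interrupting it if still running) and shuts the executor down.
    boolean stop() {
        runnerFuture.cancel(true);
        executor.shutdown();
        try {
            return executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] unused) {
        RunnerLifecycleSketch sketch = new RunnerLifecycleSketch();
        sketch.start(() -> System.out.println("runner executing"));
        System.out.println("finished: " + sketch.awaitDone(2000));
        System.out.println("stopped: " + sketch.stop());
    }
}
```

Keeping the Future is what makes it possible to interrupt a long-running reader thread when the source is stopped.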
Third, ExecRunnable. This class is the main implementation class of Exec source and implements Runnable. Let's take a look at its run method:
@Override
public void run() {
  do {
    String exitCode = "unknown";
    BufferedReader reader = null;
    String line = null;
    final List<Event> eventList = new ArrayList<Event>();
    timedFlushService = Executors.newSingleThreadScheduledExecutor(
        new ThreadFactoryBuilder().setNameFormat(
            "timedFlushExecService" + Thread.currentThread().getId() + "-%d").build());
    try {
      if (shell != null) {
        // If a shell is configured, split it on "\\s+" into an array,
        // then append command to form the final argument array.
        String[] commandArgs = formulateShellCommand(shell, command);
        // Run the system command
        process = Runtime.getRuntime().exec(commandArgs);
      } else {
        // Split command on "\\s+" into an array
        String[] commandArgs = command.split("\\s+");
        // Run the system command
        process = new ProcessBuilder(commandArgs).start();
      }
      // Read the command's output as an input stream. InputStreamReader bridges
      // the byte stream to a character stream: it reads bytes and decodes them
      // into characters using the specified charset, and each read may pull one
      // or more bytes from the underlying input stream.
      reader = new BufferedReader(
          new InputStreamReader(process.getInputStream(), charset));

      // StderrLogger dies as soon as the input stream is invalid
      // Initialize the stderr logging thread; if logStderr is false nothing is printed.
      StderrReader stderrReader = new StderrReader(new BufferedReader(
          new InputStreamReader(process.getErrorStream(), charset)), logStderr);
      stderrReader.setName("StderrReader-[" + command + "]");
      stderrReader.setDaemon(true);
      stderrReader.start();

      // Scheduled task that runs once every batchTimeout milliseconds
      future = timedFlushService.scheduleWithFixedDelay(new Runnable() {
        @Override
        public void run() {
          try {
            synchronized (eventList) {
              // Flush only if eventList is not empty and the timeout has elapsed
              if (!eventList.isEmpty() && timeout()) {
                flushEventBatch(eventList);
              }
            }
          } catch (Exception e) {
            logger.error("Exception occured when processing event batch", e);
            if (e instanceof InterruptedException) {
              Thread.currentThread().interrupt();
            }
          }
        }
      }, batchTimeout, batchTimeout, TimeUnit.MILLISECONDS);

      // Read the stream line by line
      while ((line = reader.readLine()) != null) {
        synchronized (eventList) {
          sourceCounter.incrementEventReceivedCount();
          eventList.add(EventBuilder.withBody(line.getBytes(charset)));
          // Flush to the channel once the batch size is reached or the timeout has elapsed
          if (eventList.size() >= bufferCount || timeout()) {
            flushEventBatch(eventList);
          }
        }
      }

      // The stream has no more data; flush whatever is left
      synchronized (eventList) {
        if (!eventList.isEmpty()) {
          flushEventBatch(eventList);
        }
      }
    } catch (Exception e) {
      logger.error("Failed while running command: " + command, e);
      if (e instanceof InterruptedException) {
        Thread.currentThread().interrupt();
      }
    } finally {
      if (reader != null) {
        try {
          reader.close();
        } catch (IOException ex) {
          logger.error("Failed to close reader for exec source", ex);
        }
      }
      // Kill the child process
      exitCode = String.valueOf(kill());
    }
    if (restart) {
      logger.info("Restarting in {}ms, exit code {}", restartThrottle, exitCode);
      try {
        // How long to sleep before restarting the command process
        Thread.sleep(restartThrottle);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    } else {
      logger.info("Command [" + command + "] exited with " + exitCode);
    }
    // restart controls whether the command process is restarted after it dies.
    // It defaults to false; if set to true, all of the code above runs again.
  } while (restart);
}
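The outer do/while in run() is the restart mechanism: the body always executes once, and repeats only when restart is true, sleeping for restartThrottle between attempts. A stripped-down sketch of that control flow is shown below. The names are hypothetical, and the maxRuns bound is an addition made purely so the example terminates; the loop shown in the listing above has no such limit.

```java
import java.util.function.IntSupplier;

// Illustrative only: hypothetical names, not part of Flume.
class RestartLoopSketch {

    // Runs `command` once, then keeps re-running it while `restart` is true,
    // sleeping `restartThrottleMillis` between runs. `maxRuns` is an assumption
    // added for this sketch so that it cannot loop forever.
    static int runWithRestart(IntSupplier command, boolean restart,
                              long restartThrottleMillis, int maxRuns) {
        int runs = 0;
        do {
            int exitCode = command.getAsInt();
            runs++;
            System.out.println("run " + runs + " exited with " + exitCode);
            if (restart && runs < maxRuns) {
                try {
                    // Throttle so a command that dies immediately is not respawned in a tight loop
                    Thread.sleep(restartThrottleMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        } while (restart && runs < maxRuns);
        return runs;
    }

    public static void main(String[] unused) {
        runWithRestart(() -> 1, false, 10, 5); // restart disabled: a single run
        runWithRestart(() -> 1, true, 10, 3);  // restart enabled: runs until the bound
    }
}
```

The throttle matters in practice: with restart enabled and a command that fails immediately, a sleep between attempts keeps the source from spinning.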
Fourth, the flushEventBatch method
private void flushEventBatch(List<Event> eventList) {
  // Process the events as a batch
  channelProcessor.processEventBatch(eventList);
  // Update the statistics
  sourceCounter.addToEventAcceptedCount(eventList.size());
  // Clear the list
  eventList.clear();
  // Record the time of the last push to the channel, used to check for timeout
  lastPushToChannel = systemClock.currentTimeMillis();
}
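The interplay between batchSize, batchTimeout, and the last-push timestamp can be illustrated with a small self-contained buffer. This is a sketch under stated assumptions rather than Flume's implementation: BatchBufferSketch, its injectable clock, and the Consumer sink are all hypothetical stand-ins (for the system clock and the channel processor), and events are plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

// Illustrative only: hypothetical names, not part of Flume.
class BatchBufferSketch {
    private final int batchSize;
    private final long batchTimeoutMillis;
    private final LongSupplier clock;          // stand-in for the system clock
    private final Consumer<List<String>> sink; // stand-in for the channel
    private final List<String> buffer = new ArrayList<>();
    private long lastPush;

    BatchBufferSketch(int batchSize, long batchTimeoutMillis,
                      LongSupplier clock, Consumer<List<String>> sink) {
        this.batchSize = batchSize;
        this.batchTimeoutMillis = batchTimeoutMillis;
        this.clock = clock;
        this.sink = sink;
        this.lastPush = clock.getAsLong();
    }

    // Reader path: add a line, flush when the batch is full or the timeout has elapsed.
    synchronized void add(String line) {
        buffer.add(line);
        if (buffer.size() >= batchSize || timedOut()) {
            flush();
        }
    }

    // Timer path: flush a partial batch once the timeout has elapsed.
    synchronized void flushIfTimedOut() {
        if (!buffer.isEmpty() && timedOut()) {
            flush();
        }
    }

    private boolean timedOut() {
        return clock.getAsLong() - lastPush >= batchTimeoutMillis;
    }

    private void flush() {
        sink.accept(new ArrayList<>(buffer)); // hand a copy downstream
        buffer.clear();
        lastPush = clock.getAsLong();         // remember when we last pushed
    }

    public static void main(String[] unused) {
        BatchBufferSketch b = new BatchBufferSketch(2, 3000,
            System::currentTimeMillis, batch -> System.out.println("flushed " + batch));
        b.add("line 1");
        b.add("line 2"); // reaches batchSize, triggers a flush
    }
}
```

Both paths synchronize on the same object, which corresponds to the reader loop and the scheduled task both locking eventList in the run method.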
That is the general flow of ExecSource.