FileChannel is one of the most important and most frequently used channel components in Flume. It is also fairly complex: its code involves three main packages, org.apache.flume.channel.file, org.apache.flume.channel.file.encryption (encryption support), and org.apache.flume.channel.file.proto, about 40 source files in total.
First, the configure(Context) method:
1. Read the checkpointDir and dataDirs properties from the configuration file. These are the directories where checkpoints and data are stored; the defaults are $user.home/.flume/file-channel/checkpoint and $user.home/.flume/file-channel/data. checkpointDir is a single directory, while dataDirs may list several directories separated by ",". It is best not to change either setting once the channel is in use, because data is already stored there;
2, Read capacity and validate it, for example whether it is less than 0 and whether it is being changed by a dynamic reload (a change to an already-set capacity does not appear to be applied). The default is 1000000. This is the maximum number of events the checkpoint file can hold;
3, keepAlive, a timeout: the longest time a put waits for free space when the channel is full, default 3 s;
4, transactionCapacity, the maximum number of events in one transaction, default 1000;
5, checkpointInterval, the interval between checkpoint writes, default 30000 ms;
6, maxFileSize, the upper limit on the size of a single data file: the smaller of the user setting and 1623195647;
7, minimumRequiredSpace, the minimum free disk space required: max(user setting (default 500 MB), 1 MB);
8, useLogReplayV1, default false;
9, useFastReplay, default false;
10, encryptionActiveKey, the alias of the active encryption key, default null;
11, encryptionContext, the encryption configuration;
12, encryptionCipherProvider, the encryption cipher provider, default null;
13, encryptionKeyProviderName, the name of the encryption key provider, default null;
14, queueRemaining, a semaphore that tracks how much space is left in the queue; it is initialized with capacity permits;
15, Set the checkpoint interval (checkpointInterval) and the maximum file size (maxFileSize) on the log;
16, Create a new counter, channelCounter, if one does not exist yet.
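The clamping rules in items 2, 6, and 7 are easy to misread, so here is a minimal sketch of them in plain Java. It is not Flume's code: the class name, the helper methods, and the use of a `Map` in place of Flume's `Context` are assumptions made for illustration, and the fallback to the default capacity for a non-positive value is likewise an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper that mirrors the limits described in the list above. */
final class FileChannelSettingsSketch {
    static final int DEFAULT_CAPACITY = 1_000_000;
    static final long MAX_FILE_SIZE_LIMIT = 1_623_195_647L;
    static final long DEFAULT_MINIMUM_REQUIRED_SPACE = 500L * 1024 * 1024; // 500 MB
    static final long FLOOR_MINIMUM_REQUIRED_SPACE = 1024L * 1024;         // 1 MB

    private FileChannelSettingsSketch() {}

    /** Reads a long from the map, or returns the default when the key is absent. */
    static long getLong(Map<String, String> props, String key, long defaultValue) {
        String raw = props.get(key);
        return raw == null ? defaultValue : Long.parseLong(raw.trim());
    }

    /** Assumption: a non-positive or out-of-range capacity falls back to the default. */
    static int resolveCapacity(Map<String, String> props) {
        long configured = getLong(props, "capacity", DEFAULT_CAPACITY);
        boolean invalid = configured <= 0 || configured > Integer.MAX_VALUE;
        return invalid ? DEFAULT_CAPACITY : (int) configured;
    }

    /** The smaller of the user setting and the fixed upper limit. */
    static long resolveMaxFileSize(Map<String, String> props) {
        return Math.min(getLong(props, "maxFileSize", MAX_FILE_SIZE_LIMIT), MAX_FILE_SIZE_LIMIT);
    }

    /** The larger of the user setting (default 500 MB) and the 1 MB floor. */
    static long resolveMinimumRequiredSpace(Map<String, String> props) {
        long configured = getLong(props, "minimumRequiredSpace", DEFAULT_MINIMUM_REQUIRED_SPACE);
        return Math.max(configured, FLOOR_MINIMUM_REQUIRED_SPACE);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("maxFileSize", "9999999999");    // larger than the limit, gets clamped down
        props.put("minimumRequiredSpace", "1024"); // smaller than the floor, gets raised
        System.out.println("capacity             = " + resolveCapacity(props));
        System.out.println("maxFileSize          = " + resolveMaxFileSize(props));
        System.out.println("minimumRequiredSpace = " + resolveMinimumRequiredSpace(props));
    }
}
```

The point of the sketch is only the direction of each bound: maxFileSize can be lowered by the user but never raised past the limit, while minimumRequiredSpace can be raised but never lowered below the floor.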
Second, the start() method:
1, Build a Builder object through Log.Builder, set the corresponding parameters, and then call log = builder.build(). The Log constructor tries to acquire a lock on checkpointDir and on each of the data directories (logDirs), so if there are several file channels, each one should get its own checkpointDir and dataDirs, on different disks or at least in different directories; otherwise only one of them can be initialized. Log writes the encapsulated FlumeEvents to disk and keeps pointers to those events in an in-memory queue. It also creates a worker thread that writes a checkpoint with log.writeCheckpoint() every checkpointInterval milliseconds (30 s by default). Writing a checkpoint flushes checkpoint, inflighttakes, and inflightputs to disk: inflightputs, inflighttakes, and checkpoint.meta are rebuilt, and the checkpoint file is updated and flushed. All of these files live in checkpointDir. The same step updates the log-ID.meta files and is responsible for deleting log files that are no longer needed, together with their meta files;
2, log.replay(): once a Log object has been created, replay() must be called to reconcile the write-ahead log on disk with the queue's most recent checkpoint. It finds the largest file ID, reads the log files and applies each record according to its type, then iterates over all data directories, creates a new (empty) LogFile.Writer for each one via roll(index), and finally flushes the queue to its backing files;
3, open = true, meaning the channel is open;
4, depth = getDepth(), the size of the FlumeEventQueue; then queueRemaining.tryAcquire(depth) checks that queueRemaining still has enough permits for the events already in the queue;
5, if open == true, the counter is started.
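The replay step (item 2 above) can be pictured as applying log records on top of the queue restored from the checkpoint. The following is a simplified model, not Flume's replay code: the class, the record layout, and the rule that puts and takes only take effect when their transaction commits are assumptions chosen to illustrate "perform the corresponding operation according to the record type".

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of replaying a write-ahead log over a checkpoint. */
final class ReplaySketch {
    enum Type { PUT, TAKE, COMMIT, ROLLBACK }

    /** One log record: its type, the owning transaction, and an event pointer. */
    static final class Record {
        final Type type;
        final long txId;
        final long pointer; // ignored for COMMIT and ROLLBACK
        Record(Type type, long txId, long pointer) {
            this.type = type;
            this.txId = txId;
            this.pointer = pointer;
        }
    }

    private ReplaySketch() {}

    /** Returns the queue state after applying all committed transactions in the log. */
    static Deque<Long> replay(Deque<Long> checkpointQueue, List<Record> log) {
        Deque<Long> queue = new ArrayDeque<>(checkpointQueue); // do not mutate the input
        Map<Long, List<Record>> pending = new HashMap<>();
        for (Record r : log) {
            switch (r.type) {
                case PUT:
                case TAKE:
                    // Buffer the operation until its transaction is resolved.
                    pending.computeIfAbsent(r.txId, k -> new ArrayList<>()).add(r);
                    break;
                case ROLLBACK:
                    pending.remove(r.txId); // discard everything the transaction did
                    break;
                case COMMIT:
                    List<Record> ops = pending.remove(r.txId);
                    if (ops == null) {
                        break;
                    }
                    for (Record op : ops) {
                        if (op.type == Type.PUT) {
                            queue.addLast(op.pointer);
                        } else {
                            queue.remove(Long.valueOf(op.pointer));
                        }
                    }
                    break;
            }
        }
        // Operations of transactions that never committed are left out.
        return queue;
    }

    public static void main(String[] args) {
        Deque<Long> checkpoint = new ArrayDeque<>(Arrays.asList(1L, 2L));
        List<Record> log = Arrays.asList(
                new Record(Type.PUT, 10, 3), new Record(Type.COMMIT, 10, 0),
                new Record(Type.TAKE, 11, 1), new Record(Type.COMMIT, 11, 0),
                new Record(Type.PUT, 12, 4), new Record(Type.ROLLBACK, 12, 0));
        System.out.println(replay(checkpoint, log));
    }
}
```

In this model a crash in the middle of a transaction is harmless: its records are in the log, but without a commit record they never reach the queue.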
The createTransaction() method mainly constructs a FileBackedTransaction object, which operates on the channel directly, and returns it.
The stop() method stops the channel and cleans up.
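To show how a transaction object might interact with the queueRemaining semaphore and the transactionCapacity limit, here is a small single-threaded model. It is a sketch under assumptions, not FileBackedTransaction itself: the class and method names are hypothetical, events are plain strings, and it assumes that each put reserves one permit (waiting up to the keep-alive timeout) and that a rollback hands the permits back.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical model of capacity accounting with a semaphore; not thread-safe. */
final class PermitAccountingSketch {
    private final Semaphore queueRemaining;
    private final Queue<String> queue = new ArrayDeque<>();
    private final int transactionCapacity;
    private final long keepAliveMillis;

    PermitAccountingSketch(int capacity, int transactionCapacity, long keepAliveMillis) {
        this.queueRemaining = new Semaphore(capacity, true); // one permit per free slot
        this.transactionCapacity = transactionCapacity;
        this.keepAliveMillis = keepAliveMillis;
    }

    /** Like step 4 of start(): reserve permits for events that are already queued. */
    boolean reserveExisting(List<String> replayed) {
        if (!queueRemaining.tryAcquire(replayed.size())) {
            return false;
        }
        queue.addAll(replayed);
        return true;
    }

    Tx createTransaction() {
        return new Tx();
    }

    int remainingPermits() {
        return queueRemaining.availablePermits();
    }

    int depth() {
        return queue.size();
    }

    /** A put-only transaction; takes are left out to keep the sketch short. */
    final class Tx {
        private final List<String> puts = new ArrayList<>();

        /** Returns false if the transaction is full or no slot frees up in time. */
        boolean put(String event) {
            if (puts.size() >= transactionCapacity) {
                return false;
            }
            try {
                if (!queueRemaining.tryAcquire(keepAliveMillis, TimeUnit.MILLISECONDS)) {
                    return false;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            puts.add(event);
            return true;
        }

        /** Makes the buffered events visible in the queue; permits stay reserved. */
        void commit() {
            queue.addAll(puts);
            puts.clear();
        }

        /** Drops the buffered events and returns their permits. */
        void rollback() {
            queueRemaining.release(puts.size());
            puts.clear();
        }
    }

    public static void main(String[] args) {
        PermitAccountingSketch channel = new PermitAccountingSketch(3, 2, 10L);
        Tx tx = channel.createTransaction();
        System.out.println("put accepted: " + tx.put("event-1"));
        tx.commit();
        System.out.println("depth=" + channel.depth() + " remaining=" + channel.remainingPermits());
    }
}
```

Using a semaphore here means a full channel makes writers wait briefly instead of failing at once, which is what the keep-alive setting controls in this model.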
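The directory lock taken when the Log is built in start(), which presumably has to be released again when the channel stops, explains why two file channels cannot share a directory. Below is a minimal sketch of such a lock using java.nio file locks. The class name and the lock-file name are hypothetical, and the use of `FileChannel.tryLock()` is an assumption about the mechanism rather than a statement about Flume's code.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

/** Hypothetical exclusive lock on a directory, held through a lock file inside it. */
final class DirectoryLockSketch implements AutoCloseable {
    static final String LOCK_FILE_NAME = "in_use.lock"; // name assumed for this sketch

    private final RandomAccessFile file;
    private final FileLock lock;

    private DirectoryLockSketch(RandomAccessFile file, FileLock lock) {
        this.file = file;
        this.lock = lock;
    }

    /** Returns a lock handle, or null if the directory is already locked. */
    static DirectoryLockSketch tryAcquire(File dir) {
        RandomAccessFile raf = null;
        try {
            raf = new RandomAccessFile(new File(dir, LOCK_FILE_NAME), "rw");
            FileLock acquired = raf.getChannel().tryLock();
            if (acquired == null) { // held by another process
                raf.close();
                return null;
            }
            return new DirectoryLockSketch(raf, acquired);
        } catch (OverlappingFileLockException e) {
            // Held by another channel in this JVM. Note that on some platforms
            // closing this second channel can release OS-level locks on the file.
            closeQuietly(raf);
            return null;
        } catch (IOException e) {
            closeQuietly(raf);
            throw new UncheckedIOException(e);
        }
    }

    /** Releases the lock so that another channel instance could use the directory. */
    @Override
    public void close() {
        try {
            lock.release();
            file.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void closeQuietly(RandomAccessFile raf) {
        if (raf != null) {
            try {
                raf.close();
            } catch (IOException ignored) {
                // best effort only
            }
        }
    }

    public static void main(String[] args) {
        File dir = new File(System.getProperty("java.io.tmpdir"), "file-channel-lock-demo");
        dir.mkdirs();
        DirectoryLockSketch first = tryAcquire(dir);
        DirectoryLockSketch second = tryAcquire(dir);
        System.out.println("first acquired:  " + (first != null));
        System.out.println("second acquired: " + (second != null));
        if (first != null) {
            first.close();
        }
    }
}
```

With this model, a second channel pointed at the same checkpointDir or data directory fails at startup instead of silently corrupting the first channel's files.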