Official documentation for the sink parameters: http://flume.apache.org/FlumeUserGuide.html#hdfs-sink
Note the file format: hdfs.fileType defaults to SequenceFile, a Hadoop container format; set it to DataStream so the output can be read directly as plain text. (How to consume SequenceFile output, I have not yet worked out.)
Configuration file:
hdfs.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.channels = c1
a1.sources.r1.spoolDir = /usr/local/hadoop/apache-flume-1.6.0-bin/logs
a1.sources.r1.fileHeader = true

# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs://node4:9000/user/flume/logs/%y-%m-%d-%H
a1.sinks.k1.hdfs.filePrefix = syslog
# a1.sinks.k1.hdfs.fileSuffix = .log    # set the file suffix
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
# File size that triggers a roll, in bytes (0: never roll based on file size)
a1.sinks.k1.hdfs.rollSize = 128000000
# Number of events written to the file before a roll (0: never roll based on number of events)
a1.sinks.k1.hdfs.rollCount = 0
# File format: SequenceFile (default), DataStream, or CompressedStream
a1.sinks.k1.hdfs.fileType = DataStream    # DataStream can be read directly
# Format for sequence file records: "Text" or "Writable"
a1.sinks.k1.hdfs.writeFormat = Text
# Use local time for the escape sequences (instead of the timestamp in the event header)
a1.sinks.k1.hdfs.useLocalTimeStamp = true

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
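Since useLocalTimeStamp is enabled, the %y-%m-%d-%H escapes in hdfs.path are filled from the local clock. The same strftime-style pattern can be previewed with `date`; this is just a sketch of the directory name the sink would write under (the /user/flume/logs prefix comes from the config above):

```shell
# Preview the time bucket the hdfs.path escapes would expand to.
# %y-%m-%d-%H = two-digit year, month, day, and hour (24h).
bucket=$(date +%y-%m-%d-%H)
echo "events would land under: /user/flume/logs/$bucket"
```

With hdfs.round = true and roundValue = 10 / roundUnit = minute, the timestamp is additionally rounded down to the nearest 10 minutes before the escapes are applied, so events are grouped into 10-minute buckets within each hour directory.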
Start Hadoop
Start Flume:
./flume-ng agent -c . -f /usr/local/hadoop/apache-flume-1.6.0-bin/conf/hdfs.conf -n a1 -Dflume.root.logger=INFO,console
Generate the log file in the folder being monitored:
for i in {1..10}; do echo "test line $i" >> /usr/local/hadoop/apache-flume-1.6.0-bin/logs/spool_text$i.log; done
View HDFS: http://node4:50070
Flume notes -- source listening on a directory, sink uploading to HDFS