This article covers the Flume spooling directory source + HDFS sink; some details of Flume sources are described in http://www.cnblogs.com/cnmenglang/p/6544081.html
1. Material Preparation: apache-flume-1.7.0-bin.tar.gz
2. Configuration steps:
A. Upload the package to a resources directory under your user's home (the author's user is mfz)
B. Unzip
tar -xzvf apache-flume-1.7.0-bin.tar.gz
C. Rename the template files under conf
mv flume-conf.properties.template flume-conf.properties
mv flume-env.sh.template flume-env.sh
D. Edit flume-env.sh and add the following environment variables:
export JAVA_HOME=/usr/java/jdk1.8.0_102
FLUME_CLASSPATH="/home/mfz/hadoop-2.7.3/share/hadoop/hdfs/*"
E. Create a new file hdfs.properties
LogAgent.sources = apache
LogAgent.channels = fileChannel
LogAgent.sinks = HDFS

# sources config
# spooldir monitors the specified directory for new files; once a file has been
# fully parsed, it is renamed with the suffix .COMPLETED
LogAgent.sources.apache.type = spooldir
LogAgent.sources.apache.spoolDir = /tmp/logs
LogAgent.sources.apache.channels = fileChannel
LogAgent.sources.apache.fileHeader = false

# sinks config
LogAgent.sinks.HDFS.channel = fileChannel
LogAgent.sinks.HDFS.type = hdfs
LogAgent.sinks.HDFS.hdfs.path = hdfs://master:9000/data/logs/%Y-%m-%d/%H
LogAgent.sinks.HDFS.hdfs.fileType = DataStream
LogAgent.sinks.HDFS.hdfs.writeFormat = Text
LogAgent.sinks.HDFS.hdfs.filePrefix = FlumeHdfs
LogAgent.sinks.HDFS.hdfs.batchSize = 1000
LogAgent.sinks.HDFS.hdfs.rollSize = 10240
LogAgent.sinks.HDFS.hdfs.rollCount = 0
LogAgent.sinks.HDFS.hdfs.rollInterval = 1
LogAgent.sinks.HDFS.hdfs.useLocalTimeStamp = true

# channels config
LogAgent.channels.fileChannel.type = memory
LogAgent.channels.fileChannel.capacity = 10000
LogAgent.channels.fileChannel.transactionCapacity = 100
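The spooling directory source assumes a file is complete and immutable once it appears in /tmp/logs; dropping a file that is still being written can cause parse errors. A minimal sketch of a safe hand-off (the staging path and the name app.log are illustrative, not from the original article): write the file elsewhere on the same filesystem, then mv it into the spool directory, since a same-filesystem rename is atomic.

```shell
SPOOL_DIR=/tmp/logs                 # must match LogAgent.sources.apache.spoolDir
mkdir -p "$SPOOL_DIR"

STAGE_FILE=$(mktemp)                # staging file outside the spool directory
echo "Hello World" > "$STAGE_FILE"  # write the complete contents first

# mv within one filesystem is an atomic rename, so Flume never sees a partial file
mv "$STAGE_FILE" "$SPOOL_DIR/app.log"
```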
3. Start:
1. From the apache-flume directory, execute:
bin/flume-ng agent --conf-file conf/hdfs.properties -c conf/ --name LogAgent -Dflume.root.logger=DEBUG,console
If startup fails because the monitored directory does not exist, exit with Ctrl+C and create /tmp/logs:
mkdir -p /tmp/logs
Restart the agent; this time it starts successfully.
4. Verification:
A. Open a second terminal;
B. In the monitored directory /tmp/logs, create a new file test.log
vi test.log    # contents: Hello World
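This step can also be done without an editor. Note that the spooling source treats a file as finished once it renames it to *.COMPLETED; later events must go into a new file, never be appended to an already-processed one. A sketch:

```shell
mkdir -p /tmp/logs                       # monitored directory from hdfs.properties
echo "Hello World" > /tmp/logs/test.log  # one-shot alternative to vi
# Further events need a NEW file: appending to a file the source has
# already renamed to *.COMPLETED will cause it to raise an error.
```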
C. After saving the file, check the output in the first terminal. From it we can see:
1. test.log has been parsed and renamed to test.log.COMPLETED;
2. The file generated in HDFS is hdfs://master:9000/data/logs/2017-03-13/18/FlumeHdfs.1489399757638.tmp;
3. Once rolled, FlumeHdfs.1489399757638.tmp is renamed to FlumeHdfs.1489399757638.
Next, log in to the master host and check the result in the HDFS WebUI, or open a terminal on master and, from the Hadoop installation directory, run:
bin/hadoop fs -ls -R /data/logs/
To view the contents of the file:
bin/hadoop fs -cat /data/logs/2017-03-13/18/FlumeHdfs.1489399757638
OK, Finish!
Big Data series: Flume + HDFS