Big Data Series: Flume + HDFS


This article describes Flume (spooling directory source) + HDFS. Some details of Flume's sources are covered in an earlier article: http://www.cnblogs.com/cnmenglang/p/6544081.html

1. Material Preparation: apache-flume-1.7.0-bin.tar.gz
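If you still need to download the archive, it is available from the Apache archive (the URL below assumes the standard Apache archive layout):

wget http://archive.apache.org/dist/flume/1.7.0/apache-flume-1.7.0-bin.tar.gz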

2. Configuration steps:

A. Upload the archive to your user's directory (the author uses user mfz), under a resources directory

B. Unzip

tar -xzvf apache-flume-1.7.0-bin.tar.gz

C. Rename the template files under conf

mv flume-conf.properties.template flume-conf.properties
mv flume-env.sh.template flume-env.sh

D. Edit flume-env.sh and add the following environment variables:

export JAVA_HOME=/usr/java/jdk1.8.0_102
FLUME_CLASSPATH="/home/mfz/hadoop-2.7.3/share/hadoop/hdfs/*"
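FLUME_CLASSPATH puts the Hadoop HDFS jars on Flume's classpath so the HDFS sink can write to the cluster. A quick sanity check that both paths exist (assuming the locations above):

ls /usr/java/jdk1.8.0_102/bin/java
ls /home/mfz/hadoop-2.7.3/share/hadoop/hdfs/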

E. Create a new file hdfs.properties under conf:

LogAgent.sources = apache
LogAgent.channels = fileChannel
LogAgent.sinks = HDFS

# sources config
# The spooldir source monitors the specified directory for new files; once a
# file has been fully ingested into the channel, it is renamed with the
# suffix .COMPLETED
LogAgent.sources.apache.type = spooldir
LogAgent.sources.apache.spoolDir = /tmp/logs
LogAgent.sources.apache.channels = fileChannel
LogAgent.sources.apache.fileHeader = false

# sinks config
LogAgent.sinks.HDFS.channel = fileChannel
LogAgent.sinks.HDFS.type = hdfs
LogAgent.sinks.HDFS.hdfs.path = hdfs://master:9000/data/logs/%Y-%m-%d/%H
LogAgent.sinks.HDFS.hdfs.fileType = DataStream
LogAgent.sinks.HDFS.hdfs.writeFormat = Text
LogAgent.sinks.HDFS.hdfs.filePrefix = flumeHdfs
LogAgent.sinks.HDFS.hdfs.batchSize = 1000
LogAgent.sinks.HDFS.hdfs.rollSize = 10240
LogAgent.sinks.HDFS.hdfs.rollCount = 0
LogAgent.sinks.HDFS.hdfs.rollInterval = 1
LogAgent.sinks.HDFS.hdfs.useLocalTimeStamp = true

# channels config
LogAgent.channels.fileChannel.type = memory
LogAgent.channels.fileChannel.capacity = 10000
LogAgent.channels.fileChannel.transactionCapacity = 100
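Note that although the channel is named fileChannel, its type above is memory, so buffered events are lost if the agent dies. If durability matters, a minimal sketch of switching to a real file channel (the checkpoint and data paths below are hypothetical):

LogAgent.channels.fileChannel.type = file
LogAgent.channels.fileChannel.checkpointDir = /home/mfz/flume/checkpoint
LogAgent.channels.fileChannel.dataDirs = /home/mfz/flume/data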

3. Start:

Execute the following from the apache-flume directory:

bin/flume-ng agent --conf-file conf/hdfs.properties -c conf/ --name LogAgent -Dflume.root.logger=DEBUG,console

Startup fails because the monitored directory does not exist yet. Press Ctrl+C to exit, then create the monitored directory /tmp/logs:

mkdir -p /tmp/logs

Restart the agent:

This time it starts successfully!
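Once the agent runs cleanly in the foreground, you may prefer to run it in the background; one option (using the LOGFILE appender from Flume's default log4j.properties instead of the console) could be:

nohup bin/flume-ng agent --conf-file conf/hdfs.properties -c conf/ --name LogAgent -Dflume.root.logger=INFO,LOGFILE &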

4. Verify:

A. Open another terminal;

B. Create a new file test.log in the monitored directory /tmp/logs

vi test.log
# file content: Test Hello World
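Equivalently, you can create the file without an editor:

echo "Test Hello World" > /tmp/logs/test.log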

C. After saving the file, check the output in the previous terminal

From the console output you can see:

1. test.log has been parsed and renamed test.log.COMPLETED;

2. The file generated in HDFS is: hdfs://master:9000/data/logs/2017-03-13/18/flumeHdfs.1489399757638.tmp

3. The file flumeHdfs.1489399757638.tmp is then renamed flumeHdfs.1489399757638 once it is rolled

Next, log in to the master host and open the HDFS web UI, or open a terminal on master and run the following from the Hadoop installation directory:

bin/hadoop fs -ls -R /data/logs/

To view the contents of the file:
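For example, using the file name from the listing above (yours will differ):

bin/hadoop fs -cat /data/logs/2017-03-13/18/flumeHdfs.1489399757638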

OK, finished!
