Flume is a highly available, highly reliable, distributed system for collecting, aggregating, and transporting large volumes of log data, provided by Cloudera. Flume lets you customize the data senders in a logging system to collect data; it can also perform simple processing on the data before writing it to a variety of (customizable) data receivers.
Using Flume to move data from Kafka to HDFS
The configuration file is as follows:
flumetohdfs_agent.sources = source_from_kafka
flumetohdfs_agent.channels = mem_channel
flumetohdfs_agent.sinks = hdfs_sink
#auto.commit.enable = true

## Kerberos config ##
#flumetohdfs_agent.sinks.hdfs_sink.hdfs.kerberosPrincipal = flume/datanode2.hdfs.alpha.com@OMGHADOOP.COM
#flumetohdfs_agent.sinks.hdfs_sink.hdfs.kerberosKeytab = /root/apache-flume-1.6.0-bin/conf/flume.keytab

# For each one of the sources, the type is defined
flumetohdfs_agent.sources.source_from_kafka.type = org.apache.flume.source.kafka.KafkaSource
flumetohdfs_agent.sources.source_from_kafka.zookeeperConnect = 10.129.142.46:2181,10.166.141.46:2181,10.166.141.47:2181/testkafka
flumetohdfs_agent.sources.source_from_kafka.topic = itil_topic_4097
#flumetohdfs_agent.sources.source_from_kafka.batchSize = 10000
flumetohdfs_agent.sources.source_from_kafka.groupId = flume4097
flumetohdfs_agent.sources.source_from_kafka.channels = mem_channel

# The sink can be defined as follows.
flumetohdfs_agent.sinks.hdfs_sink.type = hdfs
#flumetohdfs_agent.sinks.hdfs_sink.hdfs.filePrefix = %{host}
flumetohdfs_agent.sinks.hdfs_sink.hdfs.path = hdfs://10.49.133.77:9000/data/4097/ds=%y%m%d

## roll every hour (after gz)
flumetohdfs_agent.sinks.hdfs_sink.hdfs.rollSize = 0
flumetohdfs_agent.sinks.hdfs_sink.hdfs.rollCount = 0
flumetohdfs_agent.sinks.hdfs_sink.hdfs.rollInterval = 3600
#flumetohdfs_agent.sinks.hdfs_sink.hdfs.threadsPoolSize =
#flumetohdfs_agent.sinks.hdfs_sink.hdfs.codeC = gzip
#flumetohdfs_agent.sinks.hdfs_sink.hdfs.fileType = CompressedStream
flumetohdfs_agent.sinks.hdfs_sink.hdfs.fileType = DataStream
flumetohdfs_agent.sinks.hdfs_sink.hdfs.writeFormat = Text

# Specify the channel the sink should use
flumetohdfs_agent.sinks.hdfs_sink.channel = mem_channel

# Each channel's type is defined.
flumetohdfs_agent.channels.mem_channel.type = memory

# Other config values specific to each type of channel (sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
flumetohdfs_agent.channels.mem_channel.capacity = 100000
flumetohdfs_agent.channels.mem_channel.transactionCapacity = 10000
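Before starting the agent, it is worth confirming that the topic exists under the ZooKeeper chroot the source points at and that messages are actually flowing. A minimal check, assuming an older Kafka release (the 0.8.x line that the Flume 1.6 Kafka source targets) installed locally, with the ZooKeeper address and topic taken from the configuration above:

# List the topics registered under the /testkafka chroot
./bin/kafka-topics.sh --zookeeper 10.129.142.46:2181/testkafka --list

# Consume a few messages to confirm data is arriving
./bin/kafka-console-consumer.sh --zookeeper 10.129.142.46:2181/testkafka --topic itil_topic_4097 --from-beginning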
Start the agent:
./flume-ng agent --conf ../conf/ -n flumetohdfs_agent -f ../conf/flume-conf-4097.properties
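If you want the agent to keep running after the shell exits, a common pattern is to launch it with nohup and then check that files are appearing under the sink's hdfs.path. This is a sketch, assuming a Hadoop client is available on the same machine:

# Run the agent in the background and capture its console output
nohup ./flume-ng agent --conf ../conf/ -n flumetohdfs_agent -f ../conf/flume-conf-4097.properties > flume-4097.log 2>&1 &

# List the partition directories to confirm files are being written
hdfs dfs -ls hdfs://10.49.133.77:9000/data/4097/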
The agent name (-n flumetohdfs_agent) must match the agent name used in the configuration file. By default the HDFS sink writes SequenceFile output, which cannot be opened and browsed directly; you can set the output format to plain text instead:
flumetohdfs_agent.sinks.hdfs_sink.hdfs.fileType = DataStream
flumetohdfs_agent.sinks.hdfs_sink.hdfs.writeFormat = Text
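With DataStream/Text output the files are plain text, so they can be inspected directly. For example (ds=160801 is a hypothetical partition date in the %y%m%d format used by hdfs.path above):

# Print the first lines of the files in one daily partition
hdfs dfs -cat 'hdfs://10.49.133.77:9000/data/4097/ds=160801/*' | head -n 20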
You can also enable compressed output:
flumetohdfs_agent.sinks.hdfs_sink.hdfs.codeC = gzip
flumetohdfs_agent.sinks.hdfs_sink.hdfs.fileType = CompressedStream
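Compressed output gets a .gz suffix and is no longer readable with a plain cat, but hdfs dfs -text decompresses known codecs on the fly (again using a hypothetical ds=160801 partition):

# Decompress and print the first lines of a compressed partition
hdfs dfs -text 'hdfs://10.49.133.77:9000/data/4097/ds=160801/*.gz' | head -n 20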
See the Flume User Guide for more information: http://flume.apache.org/FlumeUserGuide.html
For moving data from Kafka to Hive, see: http://geek.csdn.net/news/detail/97941