Flume: one data source corresponding to multiple channels and multiple sinks


Original link: http://www.tuicool.com/articles/Z73UZf6


Logs collected on HADOOP2 and HADOOP3 are sent to HADOOP1 for aggregation, and HADOOP1 then delivers the data to several different destinations.



I. Overview

1. There are three machines: HADOOP1, HADOOP2, and HADOOP3. HADOOP1 is used for log aggregation.

2. HADOOP1 aggregates the logs and outputs them to multiple destinations simultaneously.

3. In Flume, one data source corresponds to multiple channels and multiple sinks; here this is configured in the consolidation-accepter.conf file, as sketched below.
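The fan-out in point 3 relies on Flume's replicating channel selector: each event read by the source is copied into every channel it is wired to. The essential lines, taken from the full consolidation-accepter.conf shown further down, are:

    agent1.sources.source1.selector.type = replicating
    agent1.sources.source1.channels = ch1 ch2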

II. Deploying Flume to collect and aggregate logs

1. Run the following on HADOOP1:

flume-ng agent --conf ./ -f consolidation-accepter.conf -n agent1 -Dflume.root.logger=INFO,console

The contents of its configuration file (consolidation-accepter.conf) are as follows:

# Finally, now that we've defined all of our components, tell
# agent1 which ones we want to activate.
agent1.channels = ch1 ch2
agent1.sources = source1
agent1.sinks = hdfssink1 sink2
agent1.sources.source1.selector.type = replicating

# Define a memory channel called ch1 on agent1
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 1000000
agent1.channels.ch1.transactionCapacity = 1000000
agent1.channels.ch1.keep-alive = 10

# Define a memory channel called ch2 on agent1
agent1.channels.ch2.type = memory
agent1.channels.ch2.capacity = 1000000
agent1.channels.ch2.transactionCapacity = 100000
# the keep-alive value was garbled in the original; 10 (matching ch1) is assumed
agent1.channels.ch2.keep-alive = 10

# Define an Avro source called source1 on agent1, bind it to
# 0.0.0.0:44444, and connect it to channels ch1 and ch2.
# (The bind address was garbled in the original; 0.0.0.0 is assumed.)
agent1.sources.source1.channels = ch1 ch2
agent1.sources.source1.type = avro
agent1.sources.source1.bind = 0.0.0.0
agent1.sources.source1.port = 44444
agent1.sources.source1.threads = 5

# Define an HDFS sink and connect it to the other end of channel ch1.
agent1.sinks.hdfssink1.channel = ch1
agent1.sinks.hdfssink1.type = hdfs
agent1.sinks.hdfssink1.hdfs.path = hdfs://mycluster/flume/%Y-%m-%d/%H%M
agent1.sinks.hdfssink1.hdfs.filePrefix = s1pa124-consolidation-accesslog-%H-%M-%S
agent1.sinks.hdfssink1.hdfs.useLocalTimeStamp = true
agent1.sinks.hdfssink1.hdfs.writeFormat = Text
agent1.sinks.hdfssink1.hdfs.fileType = DataStream
agent1.sinks.hdfssink1.hdfs.rollInterval = 1800
agent1.sinks.hdfssink1.hdfs.rollSize = 5073741824
agent1.sinks.hdfssink1.hdfs.batchSize = 10000
agent1.sinks.hdfssink1.hdfs.rollCount = 0
agent1.sinks.hdfssink1.hdfs.round = true
# the roundValue was garbled in the original; 10 is assumed
agent1.sinks.hdfssink1.hdfs.roundValue = 10
agent1.sinks.hdfssink1.hdfs.roundUnit = minute

# Define a file-rolling sink on channel ch2 that writes events to local disk.
# (The original shows type "logger", but the sink.directory/sink.filename
# properties below belong to the file_roll sink, so file_roll is assumed.)
agent1.sinks.sink2.type = file_roll
agent1.sinks.sink2.sink.batchsize = 10000
agent1.sinks.sink2.sink.batchtimeout = 600000
agent1.sinks.sink2.sink.rollInterval = 1000
agent1.sinks.sink2.sink.directory = /root/data/flume-logs/
agent1.sinks.sink2.sink.filename = accesslog
agent1.sinks.sink2.channel = ch2
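Once agent1 is running, both sinks can be sanity-checked from HADOOP1, for example (assuming the HDFS client is on the PATH and the paths from the configuration above are in use):

    hdfs dfs -ls hdfs://mycluster/flume/
    ls -l /root/data/flume-logs/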

2. On HADOOP2 and HADOOP3, run the following command:

flume-ng agent --conf ./ --conf-file collect-send.conf --name agent2

The contents of the sender's configuration file (collect-send.conf) are as follows:

agent2.sources = source2
agent2.sinks = sink1
agent2.channels = ch2

# source configuration
agent2.sources.source2.type = exec
agent2.sources.source2.command = tail -f /root/data/flume.log
agent2.sources.source2.channels = ch2

# channel configuration
agent2.channels.ch2.type = memory
agent2.channels.ch2.capacity = 10000
agent2.channels.ch2.transactionCapacity = 10000
agent2.channels.ch2.keep-alive = 3

# sink configuration
# (ConsolidationIPAddress is a placeholder; replace it with HADOOP1's address)
agent2.sinks.sink1.type = avro
agent2.sinks.sink1.hostname = ConsolidationIPAddress
agent2.sinks.sink1.port = 44444
agent2.sinks.sink1.channel = ch2
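To push a few test events through the pipeline, append lines to the file that the exec source tails (a quick end-to-end check; it assumes /root/data/flume.log exists on HADOOP2 or HADOOP3):

    echo "test event from $(hostname) at $(date)" >> /root/data/flume.log

Each appended line should then appear both in HDFS under hdfs://mycluster/flume/ and in /root/data/flume-logs/ on HADOOP1.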
III. Starting the processes and configuration notes

1. Start the Flume aggregation process:

  flume-ng agent --conf ./ -f consolidation-accepter.conf -n agent1 -Dflume.root.logger=INFO,console

2. Start the Flume collection process:

  flume-ng agent --conf ./ --conf-file collect-send.conf --name agent2

3. Configuration parameter notes (the following two conditions are ORed: a new file is rolled as soon as either one is satisfied):

(1) Every half hour, the data in the channel is flushed to the sink and a new file is started:

    agent1.sinks.hdfssink1.hdfs.rollInterval = 1800

(2) When the current file reaches 5073741824 bytes (about 4.7 GB), a new file is started:

    agent1.sinks.hdfssink1.hdfs.rollSize = 5073741824
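Since hdfs.rollCount = 0 in the configuration above, count-based rolling is disabled, so only the two conditions listed here apply. The time trigger can be switched off the same way if, say, rolling purely by size is wanted (a sketch using the standard HDFS sink roll parameters):

    # disable time-based rolling
    agent1.sinks.hdfssink1.hdfs.rollInterval = 0
    agent1.sinks.hdfssink1.hdfs.rollSize = 5073741824
    # disable count-based rolling
    agent1.sinks.hdfssink1.hdfs.rollCount = 0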

Installation reference: http://blog.csdn.net/panguoyuan/article/details/39555239

User Manual reference: http://flume.apache.org/FlumeUserGuide.html
