Flume: one data source corresponding to multiple channels and multiple sinks

Source: Internet
Author: User

I. Overview

1. There are three machines: HADOOP1, HADOOP2, and HADOOP3; logs are consolidated on HADOOP1.

2. HADOOP1 writes the consolidated output to multiple targets simultaneously.



3. One Flume data source feeding multiple channels and multiple sinks is configured in the consolidation-accepter.conf file.
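The fan-out at the heart of this setup is Flume's replicating channel selector: every event from the source is copied into each configured channel, and channels listed as optional may fail without failing the event. A minimal Python sketch of that idea (plain lists stand in for channels; this is an illustration, not Flume's Java implementation):

```python
# Toy model of Flume's "replicating" channel selector: each event is
# copied into every channel; optional channels may drop events silently.
def replicate(event, required, optional=()):
    for ch in required:
        ch.append(dict(event))   # a failure here would fail the put
    for ch in optional:
        try:
            ch.append(dict(event))
        except Exception:
            pass                 # optional channels never fail the event

ch1, ch2 = [], []
replicate({"body": "access log line"}, required=[ch1, ch2])
print(ch1 == ch2)  # -> True: each channel holds its own copy
```

In the configuration below, ch1 is additionally marked optional, so a failure to put into ch1 would not abort delivery to ch2.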

II. Deploying Flume to collect and consolidate logs

1. Run the following on HADOOP1:

flume-ng agent --conf ./ -f consolidation-accepter.conf -n agent1 -Dflume.root.logger=INFO,console

Its configuration file (consolidation-accepter.conf) reads as follows:

# Finally, now that we've defined all the components, tell
# agent1 which ones we want to activate.
agent1.channels = ch1 ch2
agent1.sources = source1
agent1.sinks = hdfssink1 sink2
agent1.sources.source1.selector.type = replicating
agent1.sources.source1.selector.optional = ch1

# Define two memory channels, ch1 and ch2, on agent1
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 1000000
agent1.channels.ch1.transactionCapacity = 1000000
agent1.channels.ch1.keep-alive = 10
agent1.channels.ch2.type = memory
agent1.channels.ch2.capacity = 1000000
agent1.channels.ch2.transactionCapacity = 100000
agent1.channels.ch2.keep-alive = 10

# Define an Avro source called source1 on agent1 and tell it
# to bind to 0.0.0.0:44444. Connect it to channels ch1 and ch2.
agent1.sources.source1.channels = ch1 ch2
agent1.sources.source1.type = avro
agent1.sources.source1.bind = 0.0.0.0
agent1.sources.source1.port = 44444
agent1.sources.source1.threads = AA

# Define an HDFS sink and connect it to the other end of channel ch1.
agent1.sinks.hdfssink1.channel = ch1
agent1.sinks.hdfssink1.type = hdfs
agent1.sinks.hdfssink1.hdfs.path = hdfs://mycluster/flume/%Y-%m-%d/%H%M
agent1.sinks.hdfssink1.hdfs.filePrefix = S1pa124-consolidation-accesslog-%H-%M-%S
agent1.sinks.hdfssink1.hdfs.useLocalTimeStamp = true
agent1.sinks.hdfssink1.hdfs.writeFormat = Text
agent1.sinks.hdfssink1.hdfs.fileType = DataStream
agent1.sinks.hdfssink1.hdfs.rollInterval = 1800
agent1.sinks.hdfssink1.hdfs.rollSize = 5073741824
agent1.sinks.hdfssink1.hdfs.batchSize = 10000
agent1.sinks.hdfssink1.hdfs.rollCount = 0
agent1.sinks.hdfssink1.hdfs.round = true
agent1.sinks.hdfssink1.hdfs.roundValue = 60
agent1.sinks.hdfssink1.hdfs.roundUnit = minute

# Define a logging sink that writes the events it receives to local
# files, and connect it to the other end of channel ch2.
agent1.sinks.sink2.type = logger
agent1.sinks.sink2.sink.batchSize = 10000
agent1.sinks.sink2.sink.batchTimeout = 600000
agent1.sinks.sink2.sink.rollInterval = 1000
agent1.sinks.sink2.sink.directory = /root/data/flume-logs/
agent1.sinks.sink2.sink.fileName = accesslog
agent1.sinks.sink2.channel = ch2
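The %Y-%m-%d/%H%M escapes in hdfs.path are expanded from the event timestamp (here the agent's local clock, since hdfs.useLocalTimeStamp is true), so files are bucketed into per-day, per-minute directories. The escape codes match strftime, which a short Python sketch can illustrate (the timestamp is an arbitrary example):

```python
from datetime import datetime

# Sketch: how Flume's %Y-%m-%d/%H%M escapes in hdfs.path resolve to a
# concrete HDFS directory. Flume does this in Java; Python's strftime
# happens to use the same format codes.
def expand_hdfs_path(template, ts):
    return ts.strftime(template)

ts = datetime(2015, 3, 8, 14, 30)  # example timestamp, not from the article
print(expand_hdfs_path("hdfs://mycluster/flume/%Y-%m-%d/%H%M", ts))
# -> hdfs://mycluster/flume/2015-03-08/1430
```

With round = true, roundValue = 60, and roundUnit = minute, the timestamp is additionally rounded down to a 60-minute boundary before expansion, so all events from the same hour land in one directory.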
2. Run the following command on HADOOP2 and HADOOP3, respectively:

flume-ng agent --conf ./ --conf-file collect-send.conf --name agent1

The contents of the Flume sender's configuration file, collect-send.conf, are as follows:

agent2.sources = source2
agent2.sinks = sink1
agent2.channels = ch2

agent2.sources.source2.type = exec
agent2.sources.source2.command = tail -f /root/data/flume.log
agent2.sources.source2.channels = ch2

# Channel configuration
agent2.channels.ch2.type = memory
agent2.channels.ch2.capacity = 10000
agent2.channels.ch2.transactionCapacity = 10000
agent2.channels.ch2.keep-alive = 3

# Sink configuration: send to the consolidator (HADOOP1)
agent2.sinks.sink1.type = avro
agent2.sinks.sink1.hostname = ConsolidationIpAddress
agent2.sinks.sink1.port = 44444
agent2.sinks.sink1.channel = ch2
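The exec source simply runs `tail -f` and turns each newly appended line of /root/data/flume.log into one event; lines written before the agent starts are not replayed. A rough Python sketch of that behaviour (the `Follower` helper is hypothetical, for illustration only, not Flume's code):

```python
import tempfile

# Hypothetical sketch of what `tail -f` gives the exec source:
# only lines appended after startup become events.
class Follower:
    def __init__(self, path):
        self.f = open(path)
        self.f.seek(0, 2)  # jump to the current end of file, like tail -f

    def poll(self):
        """Return complete lines appended since the previous poll."""
        return [line.rstrip("\n") for line in self.f.readlines()]

# Demo: a line written before the follower starts is skipped.
with tempfile.NamedTemporaryFile("w+", suffix=".log", delete=False) as log:
    log.write("historical line\n")
    log.flush()
    follower = Follower(log.name)
    log.write("new access log event\n")
    log.flush()
    events = follower.poll()
print(events)  # -> ['new access log event']
```

Because the channel is an in-memory queue, events accepted from `tail -f` but not yet delivered to the Avro sink are lost if agent2 crashes; a file channel would trade throughput for durability here.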

III. Summary

1. Start the Flume consolidation process:

flume-ng agent --conf ./ -f consolidation-accepter.conf -n agent1 -Dflume.root.logger=INFO,console

2. Start the Flume collection process:

flume-ng agent --conf ./ --conf-file collect-send.conf --name agent1

3. Configuration parameter notes (the two conditions below have an OR relationship: a roll is triggered as soon as either one is met):

(1) Every half hour, the data in the channel is flushed to the sink and a new file is started:

agent1.sinks.hdfssink1.hdfs.rollInterval = 1800

(2) When the current file reaches 5073741824 bytes (about 4.7 GiB), a new file is started:

agent1.sinks.hdfssink1.hdfs.rollSize = 5073741824
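The OR relationship between the two roll triggers can be sketched as follows (constants copied from the configuration above):

```python
# Sketch of the OR relationship between the two roll triggers:
# the HDFS sink starts a new file as soon as EITHER condition is met.
ROLL_INTERVAL_S = 1800           # hdfs.rollInterval: flush every half hour
ROLL_SIZE_BYTES = 5073741824     # hdfs.rollSize: roll at ~4.7 GiB

def should_roll(seconds_open, bytes_written):
    """True when the current HDFS file should be closed and a new one opened."""
    return seconds_open >= ROLL_INTERVAL_S or bytes_written >= ROLL_SIZE_BYTES

print(should_roll(1800, 0), should_roll(10, ROLL_SIZE_BYTES), should_roll(10, 100))
# -> True True False
```

Note that hdfs.rollCount = 0 in the configuration disables the third, count-based trigger, which is why only these two conditions matter.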


