Original link: http://www.tuicool.com/articles/Z73UZf6
The data collected on HADOOP2 and HADOOP3 is sent to HADOOP1, which then fans it out to several different destinations.
I. Overview
1. There are three machines: HADOOP1, HADOOP2, and HADOOP3. HADOOP1 is used for log aggregation.
2. HADOOP1 writes the aggregated logs to multiple targets simultaneously.
3. In Flume, one data source feeds multiple channels and multiple sinks; this is configured in the consolidation-accepter.conf file.
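The fan-out topology described above (one source replicated into two channels, each drained by its own sink) can be sketched as a toy model in Python. The class and names here are purely illustrative, not Flume APIs:

```python
from queue import Queue

# Toy model of Flume's replicating channel selector: every event from
# the source is copied into ALL channels, and each sink then drains its
# own channel independently (HDFS sink <- ch1, logger sink <- ch2).
class ReplicatingSource:
    def __init__(self, channels):
        self.channels = channels          # list of Queue objects

    def receive(self, event):
        for ch in self.channels:         # replicate: one copy per channel
            ch.put(event)

ch1, ch2 = Queue(), Queue()
source = ReplicatingSource([ch1, ch2])
source.receive("access log line")

# Each "sink" sees its own copy of the same event.
hdfs_event = ch1.get()
logger_event = ch2.get()
print(hdfs_event == logger_event)  # True
```

Because the selector replicates rather than multiplexes, slow delivery on one channel never starves the other; each sink consumes at its own pace.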
II. Deploying Flume to collect and aggregate logs
1. Run the following on HADOOP1:
flume-ng agent --conf ./ -f consolidation-accepter.conf -n agent1 -Dflume.root.logger=INFO,console
The contents of its configuration file (consolidation-accepter.conf) are as follows:
# Finally, now that we've defined all of our components,
# tell agent1 which ones we want to activate.
agent1.channels = ch1 ch2
agent1.sources = source1
agent1.sinks = hdfssink1 sink2
agent1.sources.source1.selector.type = replicating

# Define a memory channel called ch1 on agent1
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 1000000
agent1.channels.ch1.transactionCapacity = 1000000
agent1.channels.ch1.keep-alive = 10

# Define a memory channel called ch2 on agent1
agent1.channels.ch2.type = memory
agent1.channels.ch2.capacity = 1000000
agent1.channels.ch2.transactionCapacity = 100000
agent1.channels.ch2.keep-alive =

# Define an Avro source called source1 on agent1 and tell it
# to bind to 0.0.0.0:41414. Connect it to channels ch1 and ch2.
agent1.sources.source1.channels = ch1 ch2
agent1.sources.source1.type = avro
agent1.sources.source1.bind = con
agent1.sources.source1.port = 44444
agent1.sources.source1.threads = 5

# Define an HDFS sink and connect it to the other end of channel ch1.
agent1.sinks.hdfssink1.channel = ch1
agent1.sinks.hdfssink1.type = hdfs
agent1.sinks.hdfssink1.hdfs.path = hdfs://mycluster/flume/%Y-%m-%d/%H%M
agent1.sinks.hdfssink1.hdfs.filePrefix = s1pa124-consolidation-accesslog-%H-%M-%S
agent1.sinks.hdfssink1.hdfs.useLocalTimeStamp = true
agent1.sinks.hdfssink1.hdfs.writeFormat = Text
agent1.sinks.hdfssink1.hdfs.fileType = DataStream
agent1.sinks.hdfssink1.hdfs.rollInterval = 1800
agent1.sinks.hdfssink1.hdfs.rollSize = 5073741824
agent1.sinks.hdfssink1.hdfs.batchSize = 10000
agent1.sinks.hdfssink1.hdfs.rollCount = 0
agent1.sinks.hdfssink1.hdfs.round = true
agent1.sinks.hdfssink1.hdfs.roundValue =
agent1.sinks.hdfssink1.hdfs.roundUnit = minute

# Define a logger sink that simply logs all events it receives
# and connect it to the other end of channel ch2.
agent1.sinks.sink2.type = logger
agent1.sinks.sink2.sink.batchSize = 10000
agent1.sinks.sink2.sink.batchTimeout = 600000
agent1.sinks.sink2.sink.rollInterval = 1000
agent1.sinks.sink2.sink.directory = /root/data/flume-logs/
agent1.sinks.sink2.sink.fileName = accesslog
agent1.sinks.sink2.channel = ch2
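The %Y-%m-%d/%H%M escapes in hdfs.path (and the %H-%M-%S in hdfs.filePrefix) follow strftime-style conventions, evaluated against the event timestamp (here the local clock, since useLocalTimeStamp is true). A quick Python check previews which HDFS directory and file prefix a given timestamp maps to:

```python
from datetime import datetime

# Flume's %Y-%m-%d/%H%M escape sequences match strftime conventions,
# so strftime lets us preview the bucket a timestamp lands in.
ts = datetime(2015, 3, 9, 14, 37, 5)
path = "hdfs://mycluster/flume/" + ts.strftime("%Y-%m-%d/%H%M")
prefix = ts.strftime("s1pa124-consolidation-accesslog-%H-%M-%S")
print(path)    # hdfs://mycluster/flume/2015-03-09/1437
print(prefix)  # s1pa124-consolidation-accesslog-14-37-05
```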
2. Run the following command on HADOOP2 and HADOOP3 respectively:
flume-ng agent --conf ./ --conf-file collect-send.conf --name agent2
The contents of the Flume collector configuration file collect-send.conf are as follows:
agent2.sources = source2
agent2.sinks = sink1
agent2.channels = ch2

agent2.sources.source2.type = exec
agent2.sources.source2.command = tail -f /root/data/flume.log
agent2.sources.source2.channels = ch2

# channel configuration
agent2.channels.ch2.type = memory
agent2.channels.ch2.capacity = 10000
agent2.channels.ch2.transactionCapacity = 10000
agent2.channels.ch2.keep-alive = 3

# sink configuration
agent2.sinks.sink1.type = avro
agent2.sinks.sink1.hostname = consolidationIPAddress   # the address of HADOOP1
agent2.sinks.sink1.port = 44444
agent2.sinks.sink1.channel = ch2
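The exec source simply runs the configured command and turns each line it emits into an event. A rough sketch of that behavior (hypothetical names, not Flume code) also shows why a bounded memory channel trades durability for speed, since events can be lost once capacity is reached:

```python
from queue import Queue, Full

# Rough sketch of an exec source: read lines (as `tail -f` would emit
# them) and push each one into a bounded in-memory channel. When the
# channel is full (capacity reached), events are dropped -- one reason
# memory channels trade durability for speed.
def exec_source(lines, channel):
    dropped = 0
    for line in lines:
        try:
            channel.put_nowait(line.rstrip("\n"))
        except Full:
            dropped += 1
    return dropped

channel = Queue(maxsize=3)   # tiny "capacity" for the demo
log_lines = ["GET /a\n", "GET /b\n", "GET /c\n", "GET /d\n"]
dropped = exec_source(log_lines, channel)
print(channel.qsize(), dropped)  # 3 1
```

In the real config above the capacity is 10000, so drops only happen if the Avro sink cannot keep up with the log rate for a sustained period.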
III. Starting the processes
1. Start the Flume aggregation process on HADOOP1:
flume-ng agent --conf ./ -f consolidation-accepter.conf -n agent1 -Dflume.root.logger=INFO,console
2. Start the Flume collection process on HADOOP2 and HADOOP3:
flume-ng agent --conf ./ --conf-file collect-send.conf --name agent2
3. Configuration parameter notes (the following two roll conditions are ORed together; a new file is started as soon as either one is satisfied):
(1) Every half hour (1800 seconds), the data in the channel is flushed to the sink and a new file is started:
agent1.sinks.hdfssink1.hdfs.rollInterval = 1800
(2) When the current file reaches 5073741824 bytes (about 4.7 GiB), a new file is started:
agent1.sinks.hdfssink1.hdfs.rollSize = 5073741824
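The OR relationship between the two roll conditions can be written out explicitly. This is an illustrative predicate, not Flume's actual implementation:

```python
ROLL_INTERVAL_S = 1800          # hdfs.rollInterval
ROLL_SIZE_BYTES = 5073741824    # hdfs.rollSize (~4.7 GiB)

def should_roll(elapsed_s, file_size_bytes):
    """Roll the HDFS file when EITHER condition is met (logical OR)."""
    return elapsed_s >= ROLL_INTERVAL_S or file_size_bytes >= ROLL_SIZE_BYTES

print(should_roll(1800, 0))          # True  -- half hour elapsed
print(should_roll(60, 5073741824))   # True  -- size limit hit
print(should_roll(60, 1024))         # False -- neither condition met
```

Since hdfs.rollCount is set to 0 in the config above, the event-count roll condition is disabled, and only these two conditions apply.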
Installation reference: http://blog.csdn.net/panguoyuan/article/details/39555239
User Manual reference: http://flume.apache.org/FlumeUserGuide.html