I. Overview
1. This article builds a highly available Flume setup that collects data and stores it on HDFS. The architecture is as follows:
[Figure: architecture diagram, https://s5.51cto.com/wyfs02/M01/05/CC/wKiom1msukvhD4OfAACMzR0FBDM139.png]
II. Configuring the agent
1. cat flume-client.properties
# Name the components on this agent: declare the source, channel, and sinks
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1

# Describe/configure the source: a syslogtcp source listening on local TCP port 5140
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5140
a1.sources.r1.host = localhost
a1.sources.r1.channels = c1

# Define the sink group: k1 and k2 form group g1, processed in load-balancing mode
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.backoff = true
a1.sinkgroups.g1.processor.selector = round_robin

# Define sink 1: events are sent to the two collector machines via Avro
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = hadoop1
a1.sinks.k1.port = 5150

# Define sink 2
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = hadoop2
a1.sinks.k2.port = 5150

# Use a channel that buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sinks to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c1

# a2 and a3 are configured the same way as a1
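To make the load_balance settings above more concrete, here is a minimal Python sketch of round-robin sink selection with backoff. It is a toy model, not Flume's implementation: the class name `RoundRobinSelector`, the fixed `backoff_ms` value, and the explicit `now_ms` clock are all assumptions made for illustration.

```python
class RoundRobinSelector:
    """Toy model of round_robin selection with backoff (not Flume's code)."""

    def __init__(self, sinks, backoff_ms=1000):
        self.sinks = list(sinks)
        self.backoff_ms = backoff_ms   # assumed fixed backoff, for illustration
        self.blocked_until = {}        # sink name -> time (ms) until which it is skipped
        self.next_index = 0

    def candidates(self, now_ms):
        """Return the sinks to try, in round-robin order, skipping backed-off ones."""
        n = len(self.sinks)
        order = [self.sinks[(self.next_index + i) % n] for i in range(n)]
        self.next_index = (self.next_index + 1) % n
        return [s for s in order if self.blocked_until.get(s, 0) <= now_ms]

    def report_failure(self, sink, now_ms):
        """Temporarily remove a failed sink from the rotation."""
        self.blocked_until[sink] = now_ms + self.backoff_ms


selector = RoundRobinSelector(["k1", "k2"])
first = selector.candidates(now_ms=0)       # rotation starts at k1
second = selector.candidates(now_ms=0)      # rotation starts at k2
selector.report_failure("k2", now_ms=0)     # pretend the collector behind k2 is down
during_backoff = selector.candidates(now_ms=500)
after_backoff = selector.candidates(now_ms=2000)
```

With two sinks, the rotation alternates between k1 and k2; while k2 is backed off, only k1 is offered, and k2 rejoins the rotation once the backoff period has passed.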
III. Configuring the collector
1. cat flume-server.properties
# Name the components on this agent: declare the source, channel, and sink
collector1.sources = r1
collector1.channels = c1
collector1.sinks = k1

# Describe the source: an Avro source
collector1.sources.r1.type = avro
collector1.sources.r1.port = 5150
collector1.sources.r1.bind = 0.0.0.0
collector1.sources.r1.channels = c1

# Describe channel c1, which buffers events in memory
collector1.channels.c1.type = memory
collector1.channels.c1.capacity = 1000
collector1.channels.c1.transactionCapacity = 100

# Describe sink k1: events are written to HDFS
collector1.sinks.k1.type = hdfs
collector1.sinks.k1.channel = c1
collector1.sinks.k1.hdfs.path = hdfs://master/user/flume/log
collector1.sinks.k1.hdfs.fileType = DataStream
collector1.sinks.k1.hdfs.writeFormat = Text
collector1.sinks.k1.hdfs.rollInterval = 300
collector1.sinks.k1.hdfs.filePrefix = %Y-%m-%d
collector1.sinks.k1.hdfs.round = true
collector1.sinks.k1.hdfs.roundValue = 5
collector1.sinks.k1.hdfs.roundUnit = minute
collector1.sinks.k1.hdfs.useLocalTimeStamp = true

# collector2 is configured the same way as collector1
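The time-related HDFS sink settings above (round, roundValue = 5, roundUnit = minute, and the %Y-%m-%d file prefix) can be illustrated with a small Python sketch. This only approximates the idea of rounding a timestamp down before expanding the escape sequences; the helper name `round_down` and the sample timestamp are assumptions, not part of Flume.

```python
from datetime import datetime


def round_down(ts, round_value, unit="minute"):
    """Round a timestamp down to a multiple of round_value minutes."""
    if unit != "minute":
        raise ValueError("this sketch only handles roundUnit = minute")
    return ts.replace(
        minute=ts.minute - ts.minute % round_value,
        second=0,
        microsecond=0,
    )


# Sample local timestamp (assumed), standing in for useLocalTimeStamp = true
event_time = datetime(2017, 9, 3, 23, 3, 54)
rounded = round_down(event_time, 5)        # 23:03:54 becomes 23:00:00
prefix = rounded.strftime("%Y-%m-%d")      # expands the filePrefix pattern
```

Because the prefix in this configuration only contains the date, the 5-minute rounding does not change the resulting file name; it would only become visible if the path or prefix also used hour or minute escapes.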
IV. Startup
1. Start flume-ng on the collector
flume-ng agent -n collector1 -c conf -f /usr/local/flume/conf/flume-server.properties -Dflume.root.logger=INFO,console   # -n is followed by the agent name used in the configuration file
2. Start flume-ng on the agent
flume-ng agent -n a1 -c conf -f /usr/local/flume/conf/flume-client.properties -Dflume.root.logger=INFO,console
V. Testing
[[email protected] ~]# echo "Hello" | nc localhost 5140   # requires nc to be installed
17/09/03 22:56:58 INFO source.AvroSource: Avro source r1 started.
17/09/03 22:59:09 INFO ipc.NettyServer: [id: 0x60551752, /192.168.100.15:34310 => /192.168.100.11:5150] OPEN
17/09/03 22:59:09 INFO ipc.NettyServer: [id: 0x60551752, /192.168.100.15:34310 => /192.168.100.11:5150] BOUND: /192.168.100.11:5150
17/09/03 22:59:09 INFO ipc.NettyServer: [id: 0x60551752, /192.168.100.15:34310 => /192.168.100.11:5150] CONNECTED: /192.168.100.15:34310
17/09/03 23:03:54 INFO hdfs.HDFSDataStream: Serializer = TEXT, UseRawLocalFileSystem = false
17/09/03 23:03:54 INFO hdfs.BucketWriter: Creating hdfs://master/user/flume/log/2017-09-03.1504494234038.tmp
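If nc is not available, the same test message can be sent with a few lines of Python. This is a minimal sketch: the helper name `send_line` and the timeout value are assumptions, and it simply writes one newline-terminated line to the TCP port that the syslogtcp source listens on.

```python
import socket


def send_line(host, port, text, timeout=5.0):
    """Open a TCP connection, send one newline-terminated line, then close."""
    payload = (text + "\n").encode("utf-8")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
    return len(payload)   # number of bytes sent


# Example call against the agent configured above (needs a running agent):
# send_line("localhost", 5140, "Hello")
```

The call is left commented out because it only succeeds when an agent is actually listening on port 5140.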
[Figure: screenshot of the test result, https://s3.51cto.com/wyfs02/M01/05/CD/wKiom1msw3yjWiPAAAAi14Jyd8g258.png]
VI. Summary
A highly available flume-ng setup generally uses one of two modes: load_balance or failover. This article uses load_balance; the failover configuration is as follows:
# Set failover
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 10
a1.sinkgroups.g1.processor.priority.k2 = 1
a1.sinkgroups.g1.processor.maxpenalty = 10000
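The failover settings can be read as "always use the highest-priority sink that is currently healthy". The Python sketch below models only that rule; the function name `pick_sink`, the `penalized_until` bookkeeping, and the explicit `now_ms` clock are assumptions for illustration and do not reproduce how Flume tracks failed sinks internally.

```python
def pick_sink(priorities, penalized_until, now_ms):
    """Return the highest-priority sink that is not currently penalized."""
    live = {
        name: priority
        for name, priority in priorities.items()
        if penalized_until.get(name, 0) <= now_ms
    }
    if not live:
        return None   # every sink is in its penalty period
    return max(live, key=live.get)


priorities = {"k1": 10, "k2": 1}   # same priorities as the configuration above

normal = pick_sink(priorities, {}, now_ms=0)
# Pretend k1 failed and is penalized until t = 10000 ms
failed_over = pick_sink(priorities, {"k1": 10000}, now_ms=5000)
recovered = pick_sink(priorities, {"k1": 10000}, now_ms=10000)
```

In this model k1 handles all traffic while it is healthy, k2 takes over during k1's penalty period, and k1 resumes once the penalty has expired.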
Some of the commonly used source, channel, and sink types are as follows:
[Figure: table of commonly used source, channel, and sink types, https://s5.51cto.com/wyfs02/M02/A4/7E/wKioL1msxPPx9ZJgAAB6kCTTeA4732.png]
This article is from the "Lullaby" blog; please retain this source attribution: http://lullaby.blog.51cto.com/10815696/1962460
High-availability Flume-ng construction