Part 1: Single-node Flume configuration
Installation Reference http://flume.apache.org/FlumeUserGuide.html
http://my.oschina.net/leejun2005/blog/288136
Here is a brief introduction. The command to run an agent:
$ bin/flume-ng agent -n $agent_name -c conf -f conf/flume-conf.properties.template
1. The single-node configuration is as follows:
# example.conf: a single-node Flume configuration
# Created by Cesar.x 2015/12/14
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
2. Then run the agent:
bin/flume-ng agent --conf conf --conf-file conf/myconf/example.conf --name a1 -Dflume.root.logger=INFO,console
PS: -Dflume.root.logger=INFO,console is for debugging only; do not copy it blindly into production, or a flood of log output will be written to the terminal.
3. Then open another shell window:
$ telnet localhost 44444
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
Hello world! <ENTER>
OK
Question 1
Here you may run into the problem that telnet is not installed. I am on a Red Hat system; if it is missing, install it directly with yum -y install telnet.
Question 2
The telnet connection is refused.
Check whether port 44444 is being listened on:
netstat -anltup | grep :44444
Reference
Http://www.2cto.com/os/201411/352191.html
We discover that the agent is listening only on the local interface (bind = localhost).
Modify the telnet command to connect to localhost from the same machine (telnet 127.0.0.1 44444),
and the connection succeeds.
Then we type Hello and press Enter,
go back to the agent's terminal,
and see that our input has been collected.
Part 2: Agent configuration in more detail
ZooKeeper-related
We can put the agent1 configuration file we just wrote into ZooKeeper. After the configuration file is uploaded, we start the agent with the command below.
The official documentation says this feature is experimental, so I will not go further with it here.
A schematic of the node in ZK.
-/flume
|-/a1 [Agent config file]
|-/a2 [Agent config file]
bin/flume-ng agent --conf conf -z zkhost:2181,zkhost1:2181 -p /flume --name a1 -Dflume.root.logger=INFO,console
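The post does not show how the config file is uploaded. As a sketch only (the ZooKeeper hostname and local config path are placeholder assumptions; zkCli.sh ships with ZooKeeper), the znode data for a1 could be set like this:

```shell
# Upload the agent config as the data of znode /flume/a1
# (requires a running ZooKeeper; zkhost is a placeholder hostname)
zkCli.sh -server zkhost:2181 create /flume ""
zkCli.sh -server zkhost:2181 create /flume/a1 "$(cat conf/myconf/example.conf)"
```

The -p /flume flag in the start command above tells the agent which base path to read the config from.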
nohup bin/flume-ng agent --conf conf --conf-file conf/myconf/flume_colletc_test.conf -n collectormainagent &
Flume to HDFS
Configuration file
# Define a memory channel called ch1 on agent1
# Created by cesar.x 2015/12/14
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 100000
agent1.channels.ch1.transactionCapacity = 100000
agent1.channels.ch1.keep-alive =

# Define an Avro source called avro-source1 on agent1 and tell it
# to bind to 0.0.0.0:41414. Connect it to channel ch1.
#agent1.sources.avro-source1.channels = ch1
#agent1.sources.avro-source1.type = avro
#agent1.sources.avro-source1.bind = 0.0.0.0
#agent1.sources.avro-source1.port = 41414
#agent1.sources.avro-source1.threads = 5

# Define a source that monitors a file
agent1.sources.avro-source1.type = exec
agent1.sources.avro-source1.shell = /bin/bash -c
agent1.sources.avro-source1.command = tail -n +0 -f /usr/local/hadoop/apache-flume-1.6.0-bin/tmp/id.txt
agent1.sources.avro-source1.channels = ch1
agent1.sources.avro-source1.threads = 5

# Define an HDFS sink and connect it to the other end of the same channel.
agent1.sinks.log-sink1.channel = ch1
agent1.sinks.log-sink1.type = hdfs
agent1.sinks.log-sink1.hdfs.path = hdfs://mycluster/user/flumetest
agent1.sinks.log-sink1.hdfs.writeFormat = Text
agent1.sinks.log-sink1.hdfs.fileType = DataStream
agent1.sinks.log-sink1.hdfs.rollInterval = 0
agent1.sinks.log-sink1.hdfs.rollSize = 1000000
agent1.sinks.log-sink1.hdfs.rollCount = 0
agent1.sinks.log-sink1.hdfs.batchSize = 1000
agent1.sinks.log-sink1.hdfs.txnEventMax =
agent1.sinks.log-sink1.hdfs.callTimeout = 60000
agent1.sinks.log-sink1.hdfs.appendTimeout = 60000

# Finally, now that we've defined all the components, tell
# agent1 which ones we want to activate.
agent1.channels = ch1
agent1.sources = avro-source1
agent1.sinks = log-sink1
Start
bin/flume-ng agent --conf conf --conf-file conf/myconf/flume_directhdfs.conf -n agent1 -Dflume.root.logger=INFO,console
Then simulate writing content to the file watched by the configuration:
echo "Test" >> 1.txt
Observe the files in HDFS,
then view the content in one of them.
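The commands for this check might look like the following sketch; the path comes from hdfs.path in the config above, and FlumeData is the sink's default file prefix (the exact file names depend on your run):

```shell
# List the files the HDFS sink created under the configured path
hdfs dfs -ls /user/flumetest
# Print the contents of the collected files
hdfs dfs -cat /user/flumetest/FlumeData.*
```

With rollInterval and rollCount set to 0 and rollSize 1000000, files roll only by size, so small tests may stay in an in-progress .tmp file until the agent closes it.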
Multi-agent to HDFS
Let's take two agents as an example.
The agents are deployed on 172.21.99.124, 172.21.99.125, and 172.21.99.126; the collectors are deployed on 172.21.99.134 and 172.21.99.135.
Here are the configurations for each web server (that is, the 3 agents).
Reference
Https://cwiki.apache.org/confluence/display/FLUME/Getting+Started#GettingStarted-flume-ngavro-clientoptions
Before this, we need to increase Flume's default memory settings. Open flume-env.sh:
export JAVA_OPTS="-Xms8192m -Xmx8192m -Xss256k -Xmn2g -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-UseGCOverheadLimit"
For the configuration, I strongly recommend taking the time to read the official documentation; it is written very clearly:
Http://flume.apache.org/FlumeUserGuide.html#setting-multi-agent-flow
Now let's look directly at how to configure a multi-agent flow.
# list the sources, sinks and channels for the agent
<Agent>.sources = <Source1> <Source2>
<Agent>.sinks = <Sink1> <Sink2>
<Agent>.channels = <Channel1> <Channel2>
Based on the diagram above, we need to configure multiple sinks.
Here are the configuration files we deployed on each application agent.
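The deployed files themselves are not reproduced in this post. As a hedged sketch only, an application-agent config with a spooldir source fanning out to the two collectors might look like this (the spool directory and port are placeholder assumptions; the collector IPs are the ones listed above):

```
# One application agent: spooldir source -> memory channel -> two avro sinks
agent.sources = src1
agent.channels = ch1
agent.sinks = sink1 sink2

agent.sources.src1.type = spooldir
agent.sources.src1.spoolDir = /data/logs/spool
agent.sources.src1.channels = ch1

agent.channels.ch1.type = memory
agent.channels.ch1.capacity = 100000

# Two avro sinks, one per collector, grouped for load balancing
agent.sinks.sink1.type = avro
agent.sinks.sink1.hostname = 172.21.99.134
agent.sinks.sink1.port = 41414
agent.sinks.sink1.channel = ch1

agent.sinks.sink2.type = avro
agent.sinks.sink2.hostname = 172.21.99.135
agent.sinks.sink2.port = 41414
agent.sinks.sink2.channel = ch1

agent.sinkgroups = g1
agent.sinkgroups.g1.sinks = sink1 sink2
agent.sinkgroups.g1.processor.type = load_balance
```

On each collector, an avro source listening on the matching port receives these events and forwards them onward.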
There is one thing to be aware of here:
the type of the source. The official documentation is very clear on this as well.
For this test we chose the spooldir source type.
In a real project, however, you may not choose the spooldir approach. In the multi-agent setup we ran into this error in the log: Expected timestamp in the Flume event headers, but it was null.
The official documentation states that the event headers must contain a timestamp unless useLocalTimeStamp is set to true.
So we add a timestamp interceptor.
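A minimal sketch of such an interceptor, assuming the source is named avro-source1 as in the earlier config (the agent and source names must match your own):

```
agent1.sources.avro-source1.interceptors = ts
agent1.sources.avro-source1.interceptors.ts.type = timestamp
```

This stamps each event with the current time at the source, so the time-based escape sequences in the HDFS sink's hdfs.path can be resolved.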
Flume to Kafka
Reference website
1. Issues that may be encountered
1.1 With two channels added, Flume would not start.
The specific symptoms are shown below.
Later we tried letting a single agent send to Kafka separately.
It is configured as follows.
Be sure to note the contents of the red box: use the hostname, and configure that hostname on the collect node,
otherwise you will get an error.
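As an illustration only (these hostnames and IPs are placeholders, not taken from the original post), mapping the Kafka broker hostnames on the collect node would mean adding entries like these to /etc/hosts:

```
172.21.99.140  kafka-broker1
172.21.99.141  kafka-broker2
```

Kafka hands clients the brokers' advertised hostnames in its metadata, so the Flume collect node must be able to resolve those names, not just the IPs.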
Once it is configured correctly, go to our Kafka cluster and start a console consumer:
bin/kafka-console-consumer.sh--zookeeper localhost:2181--from-beginning--topic my-replicated-topic
For the specific commands, see the quickstart:
Http://kafka.apache.org/documentation.html#quickstart
Then, when we click a button on the website or simply visit a page, we can see the whole pipeline in action:
JS to Nginx log to flume_agent to flume_collect to Kafka producer,
with the record then displayed in the consumer window of the Kafka message middleware.
The output printed above is what the Kafka consumer shows: the access records generated by our clicks on the site.