Flume Installation & Common Agent Configuration

The first part covers single-node Flume configuration.

Installation references:

http://flume.apache.org/FlumeUserGuide.html

http://my.oschina.net/leejun2005/blog/288136

Here is a brief introduction. The command to run an agent is:

$ bin/flume-ng agent -n $agent_name -c conf -f conf/flume-conf.properties.template

1. The single node configuration is as follows

# example.conf: A single-node Flume configuration
# Created by Cesar.x 2015/12/14

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

2. Then run the agent:

bin/flume-ng agent --conf conf --conf-file conf/myconf/example.conf --name a1 -Dflume.root.logger=INFO,console

PS: -Dflume.root.logger=INFO,console is for debugging only; do not copy it into a production environment, or large amounts of log output will be dumped to the terminal.

3. Then open another shell window:

$ telnet localhost 44444
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
Hello world! <ENTER>
OK


Question 1

Here you may run into the problem that telnet is not installed. My system is Red Hat; if telnet is missing, install it directly with yum -y install telnet.

Question 2

Telnet connection refused

Check whether port 44444 is being listened on:

netstat -anltup | grep :44444

Reference: http://www.2cto.com/os/201411/352191.html

We discover that the source is listening only on the local address, so we adjust the telnet command accordingly.

After that, the connection works without problems.

Then we type Hello and press Enter.

Going back to the agent's terminal, we can see that the information we sent has been collected.

The second part discusses agent configuration in more detail.

Zookeeper related

We can store the agent1 configuration file we just wrote in ZooKeeper; once the configuration file has been uploaded, the agent is started with the command shown further below.

However, since the official documentation describes this feature as experimental, I did not try it myself.

A schematic of the nodes in ZK:

-/flume
|-/a1 [Agent config file]
|-/a2 [Agent config file]
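In this layout, each agent's configuration file is stored as the data of its znode. Under that assumption, a hypothetical upload with ZooKeeper's CLI might look like this (the host name and config path are illustrative, not from the original post):

```
bin/zkCli.sh -server zkhost:2181 create /flume ""
bin/zkCli.sh -server zkhost:2181 create /flume/a1 "$(cat conf/myconf/example.conf)"
```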

bin/flume-ng agent --conf conf -z zkhost:2181,zkhost1:2181 -p /flume --name a1 -Dflume.root.logger=INFO,console
 
nohup bin/flume-ng agent --conf conf --conf-file conf/myconf/flume_colletc_test.conf -n collectorMainAgent &


Flume to HDFS

Configuration file

# Define a memory channel called ch1 on agent1
# Created by Cesar.x 2015/12/14
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 100000
agent1.channels.ch1.transactionCapacity = 100000
agent1.channels.ch1.keep-alive =

# Define an Avro source called avro-source1 on agent1 and tell it
# to bind to 0.0.0.0:41414. Connect it to channel ch1.
#agent1.sources.avro-source1.channels = ch1
#agent1.sources.avro-source1.type = avro
#agent1.sources.avro-source1.bind = 0.0.0.0
#agent1.sources.avro-source1.port = 41414
#agent1.sources.avro-source1.threads = 5

# Define a source that monitors a file
agent1.sources.avro-source1.type = exec
agent1.sources.avro-source1.shell = /bin/bash -c
agent1.sources.avro-source1.command = tail -n +0 -f /usr/local/hadoop/apache-flume-1.6.0-bin/tmp/id.txt
agent1.sources.avro-source1.channels = ch1
agent1.sources.avro-source1.threads = 5

# Define an HDFS sink and connect it to the other end of the same channel
agent1.sinks.log-sink1.channel = ch1
agent1.sinks.log-sink1.type = hdfs
agent1.sinks.log-sink1.hdfs.path = hdfs://mycluster/user/flumetest
agent1.sinks.log-sink1.hdfs.writeFormat = Text
agent1.sinks.log-sink1.hdfs.fileType = DataStream
agent1.sinks.log-sink1.hdfs.rollInterval = 0
agent1.sinks.log-sink1.hdfs.rollSize = 1000000
agent1.sinks.log-sink1.hdfs.rollCount = 0
agent1.sinks.log-sink1.hdfs.batchSize = 1000
agent1.sinks.log-sink1.hdfs.txnEventMax =
agent1.sinks.log-sink1.hdfs.callTimeout = 60000
agent1.sinks.log-sink1.hdfs.appendTimeout = 60000

# Finally, now that we've defined all the components, tell
# agent1 which ones we want to activate
agent1.channels = ch1
agent1.sources = avro-source1
agent1.sinks = log-sink1


Start

bin/flume-ng agent --conf conf --conf-file conf/myconf/flume_directhdfs.conf -n agent1 -Dflume.root.logger=INFO,console

Then simulate writing content to the monitored file:

echo "Test" >> 1.txt


Observe the files in HDFS, and view the content of one of them to confirm the events arrived.

Multi-agent to HDFS

Let's take the following deployment as an example.

Agents are deployed on 172.21.99.124, 172.21.99.125 and 172.21.99.126, and the collectors run on 172.21.99.134 and 172.21.99.135.

Here are the configurations for each webserver (that is, the 3 agents).

Reference

Https://cwiki.apache.org/confluence/display/FLUME/Getting+Started#GettingStarted-flume-ngavro-clientoptions

Before that, we need to change Flume's default memory settings; open flume-env.sh:

export JAVA_OPTS="-Xms8192m -Xmx8192m -Xss256k -Xmn2g -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-UseGCOverheadLimit"

For the configuration, I strongly recommend taking the time to read the official documentation; it is written very clearly:

http://flume.apache.org/FlumeUserGuide.html#setting-multi-agent-flow

With that, let's look directly at how to configure a multi-agent flow.

# List the sources, sinks and channels for the agent
<Agent>.sources = <Source1> <Source2>
<Agent>.sinks = <Sink1> <Sink2>
<Agent>.channels = <Channel1> <Channel2>

Based on the diagram above, we need to configure multiple sinks.

Here are the configuration files we deployed on each application agent.
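The original post showed this agent configuration as a screenshot, which is not preserved here. Below is a minimal sketch of what such a config might look like; the agent name, spool directory, and port are assumptions, while the collector IPs come from the deployment described above:

```properties
# Sketch of one application-agent config (names and paths assumed)
agent1.sources = src1
agent1.channels = ch1
agent1.sinks = sink1 sink2

# Spooling-directory source
agent1.sources.src1.type = spooldir
agent1.sources.src1.spoolDir = /var/log/webserver/spool
agent1.sources.src1.channels = ch1

agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 100000
agent1.channels.ch1.transactionCapacity = 1000

# Two Avro sinks, one per collector node
agent1.sinks.sink1.type = avro
agent1.sinks.sink1.hostname = 172.21.99.134
agent1.sinks.sink1.port = 41414
agent1.sinks.sink1.channel = ch1

agent1.sinks.sink2.type = avro
agent1.sinks.sink2.hostname = 172.21.99.135
agent1.sinks.sink2.port = 41414
agent1.sinks.sink2.channel = ch1

# Load-balance events across the two collectors
agent1.sinkgroups = g1
agent1.sinkgroups.g1.sinks = sink1 sink2
agent1.sinkgroups.g1.processor.type = load_balance
```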

One thing to be aware of here is the choice of source type; the official documentation describes the available types very clearly.

For this test we chose the spooldir source.

In a real project, however, spooldir may not be the right choice. In our multi-agent setup, the logs showed the error: Expected timestamp in the Flume event headers, but it was null.

The official documentation explains that the HDFS sink requires a timestamp in the event headers (unless hdfs.useLocalTimeStamp is set to true); a timestamp interceptor on the source adds that header.

So we add:
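A minimal sketch of that addition, assuming the source is named avro-source1 as in the earlier HDFS config:

```properties
# Add a timestamp interceptor so every event carries a timestamp header
agent1.sources.avro-source1.interceptors = i1
agent1.sources.avro-source1.interceptors.i1.type = timestamp
```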




Flume to Kafka

Reference website


1. Issues you may encounter

1.1 With two channels configured, Flume would not start.

Later I tried sending from a single agent to Kafka on its own, configured as follows.
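The original collector configuration appeared as a screenshot that is not preserved here. A minimal sketch, assuming an Avro source feeding Flume 1.6's built-in Kafka sink (the agent name, host name, and port are assumptions; the topic matches the consumer command used later):

```properties
# Sketch of a collector config: Avro in, Kafka out (names/hosts assumed)
collector1.sources = avro-in
collector1.channels = ch1
collector1.sinks = kafka-out

collector1.sources.avro-in.type = avro
collector1.sources.avro-in.bind = 0.0.0.0
collector1.sources.avro-in.port = 41414
collector1.sources.avro-in.channels = ch1

collector1.channels.ch1.type = memory
collector1.channels.ch1.capacity = 100000
collector1.channels.ch1.transactionCapacity = 1000

# Flume 1.6 built-in Kafka sink
collector1.sinks.kafka-out.type = org.apache.flume.sink.kafka.KafkaSink
collector1.sinks.kafka-out.topic = my-replicated-topic
collector1.sinks.kafka-out.brokerList = kafkahost:9092
collector1.sinks.kafka-out.requiredAcks = 1
collector1.sinks.kafka-out.channel = ch1
```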


Be sure to use the hostname rather than the IP here, and to configure that hostname on the collect node; otherwise an error occurs.

Once everything is configured correctly, go to the Kafka cluster and start a console consumer:

bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic my-replicated-topic


For the specific commands, see the reference:

http://kafka.apache.org/documentation.html#quickstart

Then, when we click a button on the site or visit a page, the whole pipeline runs end to end: JS to nginx log to flume_agent to flume_collect to the Kafka producer.

The record then appears in the consumer window of the Kafka message middleware: the output printed there, as seen by the Kafka consumer, is the access record generated by our clicks on the site.

