First, here is a basic introduction to Flume's components.
| Component | Description |
| --- | --- |
| Agent | A JVM process that runs Flume. Each machine runs one agent, but a single agent can contain multiple sources and sinks. |
| Client | Produces the data; runs in a separate thread. |
| Source | Collects data from the client and passes it to the channel. |
| Sink | Collects data from the channel, carries out the related operations, and runs in a separate thread. |
| Channel | Connects sources and sinks; works a bit like a queue. |
| Event | The basic payload unit of the data being transmitted. |
Out of the box, Flume supports multiple source types, including reading messages from JMS message queues, but it does not support reading from RabbitMQ, so some custom development is needed. This post is mainly about how to make Flume read data from RabbitMQ.
There is a plugin on GitHub for reading data from RabbitMQ into Flume: https://github.com/gmr/rabbitmq-flume-plugin. Its README has a description in English that is worth reading.
Environment
CentOS 7.3, JDK 1.8, CDH 5.14.0
1. Package the plugin project with Maven, which generates two jar packages.
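For reference, building the plugin typically looks like this (the exact jar file names under target/ depend on the plugin version, so treat them as examples):

```bash
# Fetch and build the RabbitMQ source plugin
git clone https://github.com/gmr/rabbitmq-flume-plugin.git
cd rabbitmq-flume-plugin
mvn clean package

# The packaged jars land under target/; exact file names vary by version
ls target/*.jar
```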
2. Because I installed Flume integrated through CDH, I put these two jars under /usr/lib. If you have a plain (non-CDH) installation, copy the two jar packages into the lib directory under the Flume installation directory instead.
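A minimal sketch of the copy step, assuming the jars were built under target/ as above (adjust the Flume home path to match your installation):

```bash
# CDH-managed install: drop the jars under /usr/lib
cp target/*.jar /usr/lib/

# Plain Apache Flume install: copy into the lib/ directory instead, e.g.
# cp target/*.jar /opt/apache-flume/lib/
```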
3. Go to the CDH management page and configure the agent. Here is the detailed configuration; in my setup, the messages are written directly into the Kafka cluster:
```
tier1.sources = source1
tier1.channels = channel1
tier1.sinks = sink1

tier1.sources.source1.type = com.aweber.flume.source.rabbitmq.RabbitMQSource
tier1.sources.source1.bind = 127.0.0.1
tier1.sources.source1.port = 5672
tier1.sources.source1.virtual-host = /
tier1.sources.source1.username = guest
tier1.sources.source1.password = guest
tier1.sources.source1.queue = test
tier1.sources.source1.prefetchCount = 10
tier1.sources.source1.channels = channel1
tier1.sources.source1.threads = 2
tier1.sources.source1.interceptors = i1
tier1.sources.source1.interceptors.i1.type = org.apache.flume.interceptor.TimestampInterceptor$Builder
tier1.sources.source1.interceptors.i1.preserveExisting = true

tier1.channels.channel1.type = memory

tier1.sinks.sink1.channel = channel1
tier1.sinks.sink1.type = org.apache.flume.sink.kafka.KafkaSink
tier1.sinks.sink1.topic = flume_out
tier1.sinks.sink1.brokerList = 127.0.0.1:9092,127.0.0.1:9093,127.0.0.1:9094
tier1.sinks.sink1.requiredAcks = 1
tier1.sinks.sink1.batchSize = 20
```
After the configuration is updated, restart the agent for the changes to take effect.
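If you are not running under CDH, you can start the agent from the command line instead; a sketch, assuming the configuration above is saved as conf/tier1.conf (the --name value must match the property prefix, tier1 here):

```bash
# Start the agent manually on a plain Flume install
bin/flume-ng agent --conf conf --conf-file conf/tier1.conf \
    --name tier1 -Dflume.root.logger=INFO,console
```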
Here you can see the RabbitMQ messages being received.
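To verify the whole pipeline end to end, you can publish a test message into the queue and watch it come out of Kafka; a sketch, assuming the rabbitmq_management plugin is enabled and the Kafka CLI tools are on the PATH (script names vary by distribution, e.g. kafka-console-consumer.sh on plain Apache Kafka):

```bash
# Publish a test message to the "test" queue via the default exchange
rabbitmqadmin publish routing_key=test payload="hello from rabbitmq"

# Watch the flume_out topic; the message should show up within a few seconds
kafka-console-consumer --bootstrap-server 127.0.0.1:9092 \
    --topic flume_out --from-beginning
```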
That's it. If you run into problems with the configuration, leave a comment and I will reply when I see it.
Flume now reads messages from the RabbitMQ message queue and writes them to Kafka.