Original link: Kafka in Action - Flume to Kafka

1. Overview
Previously, I walked through the overall development process of a Kafka project. Today I will share how Kafka gets its data source, that is, how data is produced into Kafka. Here is today's agenda:
- Data sources
- Flume to Kafka
- Data loading
- Preview
Let's get started with today's content.
2. Data sources
The data produced into Kafka is provided by Flume sinks. We use a Flume cluster to collect the logs gathered by the agents and deliver them both to the Kafka cluster (for real-time computation) and to HDFS (for offline computation). The deployment of the Flume cluster agents is not covered here; readers who are unfamiliar with it can refer to the article "High-Availability Hadoop Platform - Flume NG Practical Illustration". The data-source flow chart is shown below:
Here, Flume serves as the log-collection system: the collected data is sent to the Kafka middleware for Storm to consume and compute in real time. The overall flow is as follows: logs from each Web node are collected by a Flume agent, aggregated into the Flume cluster, and finally delivered to the Kafka cluster by Flume's sink, which completes the data-production process.
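Since the flow chart from the original post is not reproduced here, the pipeline it describes can be sketched roughly as follows:

Web nodes -> Flume agents -> Flume cluster -> Kafka cluster -> Storm (real-time computation)
                                          \-> HDFS (offline computation)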
3. Flume to Kafka
From the diagram above, the data-production process is clear. Now let's see how to implement the transport from Flume to Kafka, which I describe with the brief diagram below:
This illustrates the delivery pipeline from Flume to Kafka; let's look at how to implement it.
First, to complete this part of the process, we need to deploy both a Flume cluster and a Kafka cluster. After the clusters are deployed, we configure the data flow of the Flume sink. The configuration is as follows:
- First, configure the Spooldir source, which reads as follows:
producer.sources.s.type = spooldir
producer.sources.s.spoolDir = /home/hadoop/dir/logdfs
- Of course, Flume supports a variety of sink types, including Console, Text, HDFS, RPC, and so on. Here our receiver is the Kafka middleware, and the configuration is as follows:
producer.sinks.r.type = org.apache.flume.plugins.KafkaSink
producer.sinks.r.metadata.broker.list = dn1:9092,dn2:9092,dn3:9092
producer.sinks.r.partition.key = 0
producer.sinks.r.partitioner.class = org.apache.flume.plugins.SinglePartition
producer.sinks.r.serializer.class = kafka.serializer.StringEncoder
producer.sinks.r.request.required.acks = 0
producer.sinks.r.max.message.size = 1000000
producer.sinks.r.producer.type = sync
producer.sinks.r.custom.encoding = UTF-8
producer.sinks.r.custom.topic.name = test
With this, we have configured the data flow from the Flume sink to the receiver.
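For completeness: the two snippets above configure only the source and the sink, while a runnable agent file also needs the component declarations and a channel. Below is a minimal sketch of what the full flume-kafka-sink.properties might look like, assuming the agent is named producer (to match the -n producer flag used later) and adding a memory channel named c; the channel name and capacities are illustrative assumptions, not values from the original article:

# declare the agent's components (assumed names: source s, channel c, sink r)
producer.sources = s
producer.channels = c
producer.sinks = r

# spooldir source, bound to channel c
producer.sources.s.type = spooldir
producer.sources.s.spoolDir = /home/hadoop/dir/logdfs
producer.sources.s.channels = c

# in-memory channel; capacities are illustrative
producer.channels.c.type = memory
producer.channels.c.capacity = 10000
producer.channels.c.transactionCapacity = 100

# Kafka sink (settings as shown above), reading from channel c
producer.sinks.r.channel = c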
4. Data loading
After the configuration is complete, we begin loading data. First, we produce logs in Flume's Spooldir directory for Flume to collect, and then we use the KafkaOffsetMonitor tool to monitor how the data is being produced into Kafka. Let's start loading.
- Start the ZK cluster, as follows:
zkServer.sh start
Note: run this on each ZK node, respectively.
- Next, start the Kafka cluster, as follows:
kafka-server-start.sh config/server.properties &
Run the same command on the other Kafka nodes to complete the startup.
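To confirm that the processes are up, jps (a standard JDK tool) can be run on each node; the ZooKeeper process should appear as QuorumPeerMain and the Kafka broker as Kafka:

jps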
- Start the Kafka monitoring tool, as follows:
java -cp KafkaOffsetMonitor-assembly-0.2.0.jar com.quantifind.kafka.offsetapp.OffsetGetterWeb --zk dn1:2181,dn2:2181,dn3:2181 --port 8089 --retain 1.days
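If the tool starts successfully, its web UI should be reachable in a browser at http://dn1:8089 (assuming it was launched on dn1 with the port above), where topics, consumers, and offsets can be inspected.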
- Start the Flume agent, as follows:
flume-ng agent -n producer -c conf -f flume-kafka-sink.properties -Dflume.root.logger=ERROR,console
Then I upload logs to the /home/hadoop/dir/logdfs directory. Here I upload only a small portion of the logs; as shown, the logs are uploaded successfully.
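With the Spooldir source, "uploading" simply means placing a finished log file into the watched directory. A sketch (the file name access.log is a hypothetical example):

# copy a finished log file into the directory watched by the spooldir source
cp access.log /home/hadoop/dir/logdfs/
# after Flume ingests the file, it is renamed with a .COMPLETED suffix by default
ls /home/hadoop/dir/logdfs/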
5. Preview
Below, we use the Kafka monitoring tool to preview the uploaded log records and check whether message data has been generated in Kafka:
- Launch the Kafka cluster and preview the produced messages
- Upload logs via Flume to generate message data in Kafka (a command-line check is sketched below)
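Besides the monitoring UI, the messages can also be checked from the command line. A sketch using the ZooKeeper-based console consumer shipped with 0.8-era Kafka (matching the ZK quorum and the topic name test configured above):

kafka-console-consumer.sh --zookeeper dn1:2181,dn2:2181,dn3:2181 --topic test --from-beginning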
6. Summary
This article described Kafka's message-production process. A follow-up article in the Kafka in Action series will cover Kafka's message-consumption process. The goal here is only to lay a foundation for the later hands-on Kafka coding, so that everyone first has an overall picture of how messages are produced into Kafka.