Kafka in Action: Flume to Kafka

1. Overview

In previous posts, I introduced the overall Kafka project development process. Today I will share how Kafka gets its source data, that is, how data is produced into Kafka. Here is today's agenda:

    • Data sources
    • Flume to Kafka
    • Data loading
    • Preview

Let's get started with today's content.

2. Data sources

The data produced into Kafka is provided by a Flume sink. Here we need the Flume cluster to distribute the logs collected by the agents both to the Kafka cluster (for real-time computation) and to HDFS (for offline computation). The deployment of the Flume cluster agents is not covered here; readers who are unclear about it can refer to the article "High-Availability Hadoop Platform - Flume NG in Practice". The data source flow chart is shown below:

Here, Flume serves as the log collection system: it sends the collected data to the Kafka middleware so that Storm can consume and compute it in real time. The whole flow starts at each Web node, where logs are collected by a Flume agent and then aggregated into the Flume cluster; the data production step is completed when the Flume sink delivers the data to the Kafka cluster.

3. Flume to Kafka

The diagram above makes the data production process clear. Next, let's see how to implement the Flume-to-Kafka transport. I will describe it with a brief diagram, as shown below:

This illustrates the transport pipeline from Flume to Kafka; let's look at how to implement it.

First, to complete this part of the process, we need to deploy both the Flume cluster and the Kafka cluster. After the clusters are deployed, we configure the Flume sink data flow. The configuration is as follows:

    • The first step is to configure the spooldir source, which reads as follows:
producer.sources.s.type = spooldir
producer.sources.s.spoolDir = /home/hadoop/dir/logdfs
    • Flume supports a variety of sink types, including console, text, HDFS, RPC, and so on. Here we use the Kafka middleware as the receiver, and the configuration is as follows:
producer.sinks.r.type = org.apache.flume.plugins.KafkaSink
producer.sinks.r.metadata.broker.list = dn1:9092,dn2:9092,dn3:9092
producer.sinks.r.partition.key = 0
producer.sinks.r.partitioner.class = org.apache.flume.plugins.SinglePartition
producer.sinks.r.serializer.class = kafka.serializer.StringEncoder
producer.sinks.r.request.required.acks = 0
producer.sinks.r.max.message.size = 1000000
producer.sinks.r.producer.type = sync
producer.sinks.r.custom.encoding = UTF-8
producer.sinks.r.custom.topic.name = test

In this way, we have configured the data flow from the Flume sink to the receiving end.
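Note that the snippets above only cover the source and the sink. A complete Flume agent also needs a channel and the source/sink bindings. Below is a minimal sketch of what the full flume-kafka-sink.properties file might look like, assuming a simple memory channel named c (the channel settings are an illustrative assumption, not part of the original configuration):

# agent name: producer (matches the -n producer flag used when starting Flume)
producer.sources = s
producer.channels = c
producer.sinks = r

# spooling directory source (from the snippet above)
producer.sources.s.type = spooldir
producer.sources.s.spoolDir = /home/hadoop/dir/logdfs
producer.sources.s.channels = c

# memory channel -- an assumed, illustrative choice
producer.channels.c.type = memory
producer.channels.c.capacity = 10000

# Kafka sink (from the snippet above, abbreviated)
producer.sinks.r.type = org.apache.flume.plugins.KafkaSink
producer.sinks.r.metadata.broker.list = dn1:9092,dn2:9092,dn3:9092
producer.sinks.r.custom.topic.name = test
producer.sinks.r.channel = c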

4. Data loading

After the configuration is complete, we can begin to load data. First, we produce logs into the Flume spooldir directory so that Flume can collect them. Then we use the KafkaOffsetMonitor tool to watch how the data is produced into Kafka. Let's start loading.

    • Start the ZK cluster, as follows:
zkServer.sh start

Note: run this command on each ZK node separately.
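If the ZK nodes are the dn1, dn2 and dn3 hosts used elsewhere in this article (an assumption) and passwordless SSH is available, starting them can be scripted roughly as follows:

# assumption: ZK runs on dn1, dn2, dn3 and zkServer.sh is on each node's PATH
for node in dn1 dn2 dn3; do
  ssh hadoop@$node "zkServer.sh start"
done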

    • Start Kafka Cluster
kafka-server-start.sh config/server.properties &

Run the same command on the other Kafka nodes to complete the startup.
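Before Flume starts writing, it is worth making sure that the test topic referenced by producer.sinks.r.custom.topic.name exists. A sketch using the topic script shipped with Kafka (0.8.1 and later); the replication factor and partition count are illustrative assumptions:

# create the 'test' topic used by the Flume Kafka sink (values are illustrative)
kafka-topics.sh --create --zookeeper dn1:2181,dn2:2181,dn3:2181 --replication-factor 2 --partitions 3 --topic test
# confirm it exists
kafka-topics.sh --list --zookeeper dn1:2181,dn2:2181,dn3:2181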

    • Start the Kafka monitoring tool
java -cp KafkaOffsetMonitor-assembly-0.2.0.jar com.quantifind.kafka.offsetapp.OffsetGetterWeb --zk dn1:2181,dn2:2181,dn3:2181 --port 8089 --retain 1.days
    • Start Flume Cluster
flume-ng agent -n producer -c conf -f flume-kafka-sink.properties -Dflume.root.logger=ERROR,console
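Note that org.apache.flume.plugins.KafkaSink is not part of stock Flume NG; it comes from a separate Flume-to-Kafka plugin, so the plugin jar and its Kafka client dependencies have to be on Flume's classpath before this command will succeed. A rough sketch (the jar names are hypothetical placeholders for whatever versions you actually use):

# assumption: copy the Kafka sink plugin and its dependencies into Flume's lib directory
cp flumeng-kafka-plugin.jar $FLUME_HOME/lib/
cp kafka_*.jar scala-library-*.jar $FLUME_HOME/lib/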

Then I upload a log file to the /home/hadoop/dir/logdfs directory; here I only upload a small portion of the logs. As shown in the screenshot, the upload succeeded.
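As a concrete illustration (access.log is a hypothetical file name), "uploading" a log here simply means copying it into the spooling directory configured earlier; once Flume has fully ingested a file, the spooldir source renames it with a .COMPLETED suffix by default:

# copy a sample log into the spooling directory configured above
cp access.log /home/hadoop/dir/logdfs/
# after ingestion Flume renames the file, e.g. access.log.COMPLETED
ls /home/hadoop/dir/logdfs/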

5. Preview

Next, we use the Kafka monitoring tool to preview the uploaded log records and check whether message data has been produced in Kafka:

    • Preview of the Kafka cluster after startup, before messages are produced

    • Message data generated in Kafka after uploading logs through Flume
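Besides the KafkaOffsetMonitor UI, a quick way to confirm that messages have arrived is the console consumer shipped with Kafka (shown here in its 0.8.x, ZooKeeper-based form):

# print the messages in the 'test' topic from the beginning
kafka-console-consumer.sh --zookeeper dn1:2181,dn2:2181,dn3:2181 --topic test --from-beginning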

6. Summary

This article walked you through the Kafka message production process. Follow-up posts in the Kafka in Action series will cover the Kafka message consumption process and the rest of the pipeline. The goal here is simply to lay a foundation for the later Kafka coding work, so that you first get an overall picture of how messages are produced into Kafka.
