Kafka Combat: Kafka to Storm


1. Overview

In the "Kafka combat-flume to Kafka" in the article to share the Kafka of the data source production, today for everyone to introduce how to real-time consumption Kafka data. This uses the real-time computed model--storm. Here are the main things to share today, as shown below:

    • Data consumption
    • Storm calculation
    • Preview

Next, let's get into today's content.

2. Data consumption

Consuming Kafka data here means having Storm consume it: a KafkaSpout feeds data from Kafka into Storm, and Storm then processes the received data in real time according to business needs. The flow chart below illustrates the data consumption process:

As the diagram shows, Storm pulls data from the Kafka cluster through KafkaSpout, and after Storm has processed it, the results are persisted to a database.

3. Storm calculation

Next, we use Storm to do the computation. This step requires a working Storm cluster; if you have not yet set one up, you can refer to my earlier post "Kafka Combat: Storm Cluster". I will not repeat the cluster setup here. Below is the implementation code for this part; the KafkaSpout code is as follows:

    • KafkaSpout class:
package cn.hadoop.hdfs.storm;

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import cn.hadoop.hdfs.conf.ConfigureAPI.KafkaProperties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.IRichSpout;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;

/**
 * @Date June
 *
 * @Author Dengjie
 *
 * @Note Data source: uses KafkaSpout to consume Kafka
 */
public class KafkaSpout implements IRichSpout {

    private static final long serialVersionUID = -7107773519958260350L;
    private static final Logger logger = LoggerFactory.getLogger(KafkaSpout.class);

    SpoutOutputCollector collector;
    private ConsumerConnector consumer;
    private String topic;

    private static ConsumerConfig createConsumerConfig() {
        Properties props = new Properties();
        props.put("zookeeper.connect", KafkaProperties.ZK);
        props.put("group.id", KafkaProperties.GROUP_ID);
        props.put("zookeeper.session.timeout.ms", "40000");
        props.put("zookeeper.sync.time.ms", "200");
        props.put("auto.commit.interval.ms", "1000");
        return new ConsumerConfig(props);
    }

    public KafkaSpout(String topic) {
        this.topic = topic;
    }

    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
    }

    public void close() {
        // TODO Auto-generated method stub
    }

    public void activate() {
        this.consumer = Consumer.createJavaConsumerConnector(createConsumerConfig());
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put(topic, new Integer(1));
        Map<String, List<KafkaStream<byte[], byte[]>>> streamMap = consumer.createMessageStreams(topicCountMap);
        KafkaStream<byte[], byte[]> stream = streamMap.get(topic).get(0);
        ConsumerIterator<byte[], byte[]> it = stream.iterator();
        while (it.hasNext()) {
            String value = new String(it.next().message());
            logger.info("(consumer) ==> " + value);
            collector.emit(new Values(value), value);
        }
    }

    public void deactivate() {
        // TODO Auto-generated method stub
    }

    public void nextTuple() {
        // TODO Auto-generated method stub
    }

    public void ack(Object msgId) {
        // TODO Auto-generated method stub
    }

    public void fail(Object msgId) {
        // TODO Auto-generated method stub
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("kafkaspout"));
    }

    public Map<String, Object> getComponentConfiguration() {
        // TODO Auto-generated method stub
        return null;
    }
}
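One design point worth noting about this spout: all of the consuming and emitting happens inside a blocking while loop in activate(), and nextTuple() is left empty. Storm normally expects a spout to emit from nextTuple(), which the framework calls repeatedly on the spout thread; looping in activate() works for a simple demo because the high-level consumer's iterator blocks until messages arrive, but it keeps the spout thread stuck inside activate() and sidesteps Storm's usual ack/fail-based flow control. For production use you would typically buffer messages from the consumer and emit them from nextTuple() instead.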
    • KafkaTopology class:
package cn.hadoop.hdfs.storm.client;

import cn.hadoop.hdfs.storm.FileBlots;
import cn.hadoop.hdfs.storm.KafkaSpout;
import cn.hadoop.hdfs.storm.WordsCounterBlots;
import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.StormSubmitter;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.tuple.Fields;

/**
 * @Date June
 *
 * @Author Dengjie
 *
 * @Note KafkaTopology task
 */
public class KafkaTopology {

    public static void main(String[] args) {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("testgroup", new KafkaSpout("test"));
        builder.setBolt("file-blots", new FileBlots()).shuffleGrouping("testgroup");
        builder.setBolt("words-counter", new WordsCounterBlots(), 2).fieldsGrouping("file-blots", new Fields("words"));
        Config config = new Config();
        config.setDebug(true);
        if (args != null && args.length > 0) {
            // Online commit topology
            config.put(Config.NIMBUS_HOST, args[0]);
            config.setNumWorkers(3);
            try {
                StormSubmitter.submitTopologyWithProgressBar(KafkaTopology.class.getSimpleName(), config, builder.createTopology());
            } catch (Exception e) {
                e.printStackTrace();
            }
        } else {
            // Local commit jar
            LocalCluster local = new LocalCluster();
            local.submitTopology("counter", config, builder.createTopology());
            try {
                Thread.sleep(60000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            local.shutdown();
        }
    }
}
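The topology above wires in two bolts, FileBlots and WordsCounterBlots, whose source is not listed in this post. Below is a minimal sketch of what they might look like; the splitting and counting logic is my assumption, inferred only from the bolt names and the "words" field in the fieldsGrouping, and in a real project each public class would live in its own file under cn.hadoop.hdfs.storm:

package cn.hadoop.hdfs.storm;

import java.util.HashMap;
import java.util.Map;

import backtype.storm.task.TopologyContext;
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

/** Sketch: splits each Kafka message into words and emits them one by one. */
public class FileBlots extends BaseBasicBolt {

    private static final long serialVersionUID = 1L;

    public void execute(Tuple input, BasicOutputCollector collector) {
        String line = input.getString(0);
        for (String word : line.split("\\s+")) {
            if (word.length() > 0) {
                collector.emit(new Values(word.toLowerCase()));
            }
        }
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // The field name must match the fieldsGrouping in KafkaTopology.
        declarer.declare(new Fields("words"));
    }
}

/** Sketch: keeps a running in-memory count per word (one map per bolt task). */
public class WordsCounterBlots extends BaseBasicBolt {

    private static final long serialVersionUID = 1L;
    private Map<String, Long> counter;

    public void prepare(Map stormConf, TopologyContext context) {
        this.counter = new HashMap<String, Long>();
    }

    public void execute(Tuple input, BasicOutputCollector collector) {
        String word = input.getString(0);
        Long count = counter.get(word);
        counter.put(word, count == null ? 1L : count + 1);
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // Terminal bolt: nothing is emitted downstream.
    }
}

Because the fieldsGrouping is on "words", every occurrence of a given word is routed to the same WordsCounterBlots task, so each task's in-memory map holds a consistent count for its share of the words.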
4. Preview

First, we start the Kafka cluster without producing any messages yet, as shown below:

Next, we start the Flume cluster to begin collecting log information and transferring the data to the Kafka cluster, as shown below:

Next, we open the Storm UI to check the status of the task submitted to Storm, as shown below:

Finally, the statistical results are persisted to Redis or a MySQL database, as shown in the following example:
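The persistence details are left for a separate post, but to make this last step concrete, here is a minimal sketch of flushing the word counts into Redis with the Jedis client. The host, port, and the hash key "word_counts" are placeholders of my own, not from the original post:

import java.util.Map;

import redis.clients.jedis.Jedis;

public class RedisSink {

    // Assumed Redis location; adjust to your environment.
    private static final String REDIS_HOST = "127.0.0.1";
    private static final int REDIS_PORT = 6379;

    /** Writes each word's count into a Redis hash named "word_counts". */
    public static void persist(Map<String, Long> counter) {
        Jedis jedis = new Jedis(REDIS_HOST, REDIS_PORT);
        try {
            for (Map.Entry<String, Long> entry : counter.entrySet()) {
                jedis.hset("word_counts", entry.getKey(), String.valueOf(entry.getValue()));
            }
        } finally {
            jedis.close();
        }
    }
}

A counting bolt like WordsCounterBlots could call RedisSink.persist(counter) periodically, for example on a timer or a tick tuple, rather than on every single tuple.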

5. Summary

This post has walked through the data consumption process and previewed the persisted results. The details of the persistence step will be covered in a separate blog post; for now it is enough to be familiar with the overall flow and to preview the results.

6. Concluding remarks

That is all for this blog post. If you run into any problems while studying, you can join the group to discuss them or send me an e-mail, and I will do my best to answer. Let us encourage each other!
