Kafka-Storm integrated deployment
Preface
Apache Storm, a stream-processing framework, is the main component of distributed real-time computation. The data source for real-time computation is typically Kafka, the basic data-ingestion component. How to pass Kafka message data to Storm is the subject of this article.
0. Prepare materials
- A running, stable Kafka cluster (version: Kafka 0.8.2)
- A running, stable Storm cluster (version: Storm 0.9.8)
- Maven 3.x
1. Storm Topology Project
Storm jobs are called topologies. To run a real-time computation task, you need to create a Storm topology project. Because of the way Kafka delivers messages, the so-called Kafka-Storm integration really amounts to providing a Spout implementation that receives Kafka messages. Fortunately, a reliable KafkaSpout is built into recent official Storm releases, so you do not need to write one yourself; you only need to configure KafkaSpout as the topology's input data source.
2. Maven Configuration
This project is built on Maven.
- Main dependencies to be configured
```xml
<dependency>
  <groupId>org.apache.storm</groupId>
  <artifactId>storm-kafka</artifactId>
  <version>0.9.3</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.storm</groupId>
  <artifactId>storm-core</artifactId>
  <version>0.9.3</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka_2.10</artifactId>
  <version>0.8.2.1</version>
  <scope>provided</scope>
</dependency>
```
Note: the scope of these dependencies is "provided", so the jars must be available on the Storm cluster at runtime (see section 4).
- Maven compilation Configuration
```xml
<build>
  <finalName>storm-kafka-topology</finalName>
  <resources>
    <resource>
      <directory>src/main/resources</directory>
    </resource>
  </resources>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <version>3.1</version>
      <configuration>
        <source>1.7</source>
        <target>1.7</target>
      </configuration>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
        </execution>
      </executions>
      <configuration>
        <finalName>${project.artifactId}-${project.version}-shade</finalName>
        <filters>
          <filter>
            <artifact>*:*</artifact>
            <excludes>
              <exclude>META-INF/*.SF</exclude>
              <exclude>META-INF/*.DSA</exclude>
              <exclude>META-INF/*.RSA</exclude>
            </excludes>
          </filter>
        </filters>
        <artifactSet>
          <excludes>
            <exclude>log4j:log4j:jar:</exclude>
          </excludes>
        </artifactSet>
        <transformers>
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
          <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <mainClass>storm.kafka.example.StormTopology</mainClass>
          </transformer>
        </transformers>
      </configuration>
    </plugin>
  </plugins>
</build>
```
3. Implement Topology
The following is a simple topology example (Java).
```java
public class StormTopology {

    // Topology shutdown flag (controlled externally via messages)
    public static boolean shutdown = false;

    public static void main(String[] args) {
        // Register the ZooKeeper hosts
        BrokerHosts brokerHosts = new ZkHosts("hd182:2181,hd185:2181,hd128:2181");
        // Name of the Kafka topic to consume
        String topic = "flumeTopic";
        // ZooKeeper node under which offsets are stored
        // (note: the leading "/" is required, otherwise ZooKeeper will not recognize it)
        String zkRoot = "/kafkastorm";
        // Configure the Spout
        String spoutId = "MyKafka";
        SpoutConfig spoutConfig = new SpoutConfig(brokerHosts, topic, zkRoot, spoutId);
        // Configure the Scheme (optional)
        spoutConfig.scheme = new SchemeAsMultiScheme(new SimpleMessageScheme());
        KafkaSpout kafkaSpout = new KafkaSpout(spoutConfig);

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafka-spout", kafkaSpout);
        builder.setBolt("operator", new OperatorBolt()).shuffleGrouping("kafka-spout");

        Config conf = new Config();
        conf.setDebug(true);
        conf.setNumWorkers(3);

        // The test environment uses local mode
        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("test", conf, builder.createTopology());
        while (!shutdown) {
            Utils.sleep(100);
        }
        cluster.killTopology("test");
        cluster.shutdown();
    }
}
```
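The example above references a SimpleMessageScheme, which is not shown in the original. A minimal sketch of such a scheme, assuming the Kafka messages are UTF-8 strings and the Storm 0.9.x `backtype.storm.spout.Scheme` interface, might look like this:

```java
import java.io.UnsupportedEncodingException;
import java.util.List;

import backtype.storm.spout.Scheme;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;

// Hypothetical Scheme: decodes each raw Kafka message as a UTF-8 string
// and emits it as a single-field tuple named "message".
public class SimpleMessageScheme implements Scheme {

    @Override
    public List<Object> deserialize(byte[] ser) {
        try {
            return new Values(new String(ser, "UTF-8"));
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public Fields getOutputFields() {
        return new Fields("message");
    }
}
```

Downstream bolts then read the payload with `tuple.getStringByField("message")`.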
Because one KafkaSpout can only receive messages from a single topic, in a production topology you must configure as many spouts as your business requires, one per topic.
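For instance, consuming two topics means wiring two KafkaSpout instances into the same topology. A sketch (the topic names and component ids here are illustrative, not from the original):

```java
// Illustrative: one KafkaSpout per topic, both feeding the same bolt.
// "orderTopic" / "clickTopic" and the component ids are made-up examples.
SpoutConfig orderConfig = new SpoutConfig(brokerHosts, "orderTopic", zkRoot, "orderSpoutId");
SpoutConfig clickConfig = new SpoutConfig(brokerHosts, "clickTopic", zkRoot, "clickSpoutId");

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("order-spout", new KafkaSpout(orderConfig));
builder.setSpout("click-spout", new KafkaSpout(clickConfig));
builder.setBolt("operator", new OperatorBolt())
       .shuffleGrouping("order-spout")
       .shuffleGrouping("click-spout");
```

Note that each spout needs its own spout id, so that the two consumers store their offsets under separate ZooKeeper paths.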
4. Necessary dependent packages
Because the topology project's dependencies use the "provided" scope, you need to copy the required jars into the lib folder of the Storm installation directory, including:
- kafka_2.10-0.8.2.1.jar
- storm-kafka-0.9.3.jar
- scala-library-2.10.4.jar
- zookeeper-3.4.6.jar
- curator-client-2.6.0.jar
- curator-framework-2.6.0.jar
- curator-recipes-2.6.0.jar
- guava-16.0.1.jar
- metrics-core-2.2.0.jar
5. Launch and run
Submit the task to the Storm cluster and observe the data output. You can also view the running status of the topology's internal components in the Storm UI (cluster mode required).
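For cluster mode, the LocalCluster block in the section 3 example is replaced with StormSubmitter. A sketch, assuming Storm 0.9.x's `backtype.storm.StormSubmitter` API (the class and topology names here are illustrative):

```java
import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.generated.AlreadyAliveException;
import backtype.storm.generated.InvalidTopologyException;
import backtype.storm.topology.TopologyBuilder;

public class ClusterSubmitExample {
    public static void main(String[] args)
            throws AlreadyAliveException, InvalidTopologyException {
        TopologyBuilder builder = new TopologyBuilder();
        // ... wire up the KafkaSpout and bolts as in section 3 ...

        Config conf = new Config();
        conf.setNumWorkers(3);
        // Submits to the cluster configured in storm.yaml, instead of LocalCluster
        StormSubmitter.submitTopology("kafka-storm-topology", conf, builder.createTopology());
    }
}
```

The shaded jar built in section 2 is then deployed with the storm client, e.g. `storm jar <shaded-jar> storm.kafka.example.StormTopology`.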