Objective
This article covers integrating Kafka and Storm with Springboot, along with some of the problems encountered in the process and their solutions.
Knowledge of Kafka and Storm
If you are already familiar with Kafka and Storm, you can skip this section. If not, you can take a look at the blogs I wrote earlier. Some of the related posts are listed below.
Environment installation for Kafka and Storm
Address: http://www.panchengming.com/2018/01/26/pancm70/
Related Uses of Kafka
Address: http://www.panchengming.com/2018/01/28/pancm71/
http://www.panchengming.com/2018/02/08/pancm72/
Related Uses of Storm
Address: http://www.panchengming.com/2018/03/16/pancm75/
Springboot integrates Kafka and Storm
Why use Springboot to integrate Kafka and Storm?
In general, combining Kafka with Storm can handle most requirements, but extensibility may suffer. The current mainstream microservices framework, Springcloud, is based on Springboot, so using Springboot to integrate Kafka and Storm allows unified configuration and gives better extensibility.
What does integrating Kafka and Storm with Springboot involve?
In general, when Kafka and Storm are combined, Kafka is used to transfer data and Storm processes the data in Kafka in real time.
After adding Springboot, we are still doing exactly that; the only difference is that Kafka and Storm are now managed uniformly by Springboot.
If this still sounds abstract, consider the following simple business scenario:
A database contains a large amount of user data, much of which is not needed, i.e. dirty data. We need to clean this user data and write it back to the database, with the requirements of real time, low latency, and easy management.
So here we can use Springboot + Kafka + Storm for the corresponding development.
Development preparation
Before writing any code, we need to be clear about what we want to develop.
The business scenario above would involve a large amount of data, but here we only develop a simple demo that implements these functions, so we just need to meet the following conditions:
- Provides an interface for writing user data to Kafka;
- Use storm's spout to get Kafka data and send it to bolt;
- The bolt removes the data of users younger than 10 years old and writes the rest to MySQL;
We then integrate Springboot, Kafka and Storm according to the above requirements.
First, the corresponding jar packages are needed, so the Maven dependencies are as follows:
```xml
<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <java.version>1.8</java.version>
    <springboot.version>1.5.9.RELEASE</springboot.version>
    <mybatis-spring-boot>1.2.0</mybatis-spring-boot>
    <mysql-connector>5.1.44</mysql-connector>
    <slf4j.version>1.7.25</slf4j.version>
    <logback.version>1.2.3</logback.version>
    <kafka.version>1.0.0</kafka.version>
    <storm.version>1.2.1</storm.version>
    <fastjson.version>1.2.41</fastjson.version>
    <druid>1.1.8</druid>
</properties>

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
        <version>${springboot.version}</version>
    </dependency>
    <!-- Spring Boot Mybatis dependency -->
    <dependency>
        <groupId>org.mybatis.spring.boot</groupId>
        <artifactId>mybatis-spring-boot-starter</artifactId>
        <version>${mybatis-spring-boot}</version>
    </dependency>
    <!-- MySQL connection driver dependency -->
    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>${mysql-connector}</version>
    </dependency>
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-api</artifactId>
        <version>${slf4j.version}</version>
    </dependency>
    <dependency>
        <groupId>ch.qos.logback</groupId>
        <artifactId>logback-classic</artifactId>
        <version>${logback.version}</version>
    </dependency>
    <dependency>
        <groupId>ch.qos.logback</groupId>
        <artifactId>logback-core</artifactId>
        <version>${logback.version}</version>
    </dependency>
    <!-- Kafka -->
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka_2.12</artifactId>
        <version>${kafka.version}</version>
        <exclusions>
            <exclusion>
                <groupId>org.apache.zookeeper</groupId>
                <artifactId>zookeeper</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-log4j12</artifactId>
            </exclusion>
            <exclusion>
                <groupId>log4j</groupId>
                <artifactId>log4j</artifactId>
            </exclusion>
        </exclusions>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-clients</artifactId>
        <version>${kafka.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-streams</artifactId>
        <version>${kafka.version}</version>
    </dependency>
    <!-- Storm related jars -->
    <dependency>
        <groupId>org.apache.storm</groupId>
        <artifactId>storm-core</artifactId>
        <version>${storm.version}</version>
        <!-- exclude conflicting dependencies -->
        <exclusions>
            <exclusion>
                <groupId>org.apache.logging.log4j</groupId>
                <artifactId>log4j-slf4j-impl</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.apache.logging.log4j</groupId>
                <artifactId>log4j-1.2-api</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.apache.logging.log4j</groupId>
                <artifactId>log4j-web</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-log4j12</artifactId>
            </exclusion>
            <exclusion>
                <groupId>ring-cors</groupId>
                <artifactId>ring-cors</artifactId>
            </exclusion>
        </exclusions>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.storm</groupId>
        <artifactId>storm-kafka</artifactId>
        <version>${storm.version}</version>
    </dependency>
    <!-- Fastjson related jars -->
    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>fastjson</artifactId>
        <version>${fastjson.version}</version>
    </dependency>
    <!-- Druid data connection pool dependency -->
    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>druid</artifactId>
        <version>${druid}</version>
    </dependency>
</dependencies>
```
Once the dependencies have been added successfully, we add the corresponding configuration.
In application.properties, add the following configuration:
```properties
# log
logging.config=classpath:logback.xml

## mysql
spring.datasource.url=jdbc:mysql://localhost:3306/springBoot2?useUnicode=true&characterEncoding=utf8&allowMultiQueries=true
spring.datasource.username=root
spring.datasource.password=123456
spring.datasource.driverClassName=com.mysql.jdbc.Driver

## kafka
kafka.servers = 192.169.0.23\:9092,192.169.0.24\:9092,192.169.0.25\:9092
kafka.topicName = USER_TOPIC
kafka.autoCommit = false
kafka.maxPollRecords = 100
kafka.groupId = groupA
kafka.commitRule = earliest
```
Note: The above is only part of the configuration; the complete configuration can be found on my GitHub.
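The kafka.* values above feed the consumer used by the spout. Below is a minimal sketch, not the project's exact code, of how such a consumer could be built: the helper class name is made up, and the mapping of kafka.commitRule to Kafka's auto.offset.reset is an assumption.

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

// Illustrative helper: builds a KafkaConsumer from values like those in application.properties.
public class KafkaConsumerFactory {

    public static KafkaConsumer<String, String> create(String servers, String groupId,
                                                       boolean autoCommit, int maxPollRecords,
                                                       String offsetReset, String topic) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, servers);        // kafka.servers
        props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);                 // kafka.groupId
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, autoCommit);    // kafka.autoCommit
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, maxPollRecords);  // kafka.maxPollRecords
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, offsetReset);    // kafka.commitRule (assumed mapping)
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList(topic));               // kafka.topicName
        return consumer;
    }
}
```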
Database script:
```sql
-- Script for the springBoot2 database
CREATE TABLE `t_user` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'auto-increment id',
  `name` varchar(10) DEFAULT NULL COMMENT 'name',
  `age` int(2) DEFAULT NULL COMMENT 'age',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=15 DEFAULT CHARSET=utf8
```
Note: Since we are only simulating the business scenario here, a simple table is enough.
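For reference, the entity that maps to t_user might look like the sketch below; the field names follow the table, but the actual User class in the project may differ slightly.

```java
import java.io.Serializable;

// Minimal sketch of the entity behind the t_user table.
public class User implements Serializable {

    private static final long serialVersionUID = 1L;

    private Integer id;   // auto-increment primary key
    private String name;  // name, up to 10 characters
    private Integer age;  // age

    public Integer getId() { return id; }
    public void setId(Integer id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public Integer getAge() { return age; }
    public void setAge(Integer age) { this.age = age; }
}
```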
Code writing
Note: I will only explain a few key classes here; the link to the complete project can be found at the bottom of this post.
Before integrating Kafka and Storm with Springboot, we first write the Kafka- and Storm-related code and then do the integration.
The first step is acquiring the data source, which means using a Storm spout to pull data from Kafka.
In the earlier introduction to Storm we talked about Storm's running process. The spout is the component Storm uses to acquire data; here we mainly implement its nextTuple method, writing into it the code that fetches data from Kafka, so the data can be pulled once Storm starts.
The main code of the spout class is as follows:
```java
@Override
public void nextTuple() {
    for (;;) {
        try {
            msgList = consumer.poll(100);
            if (null != msgList && !msgList.isEmpty()) {
                String msg = "";
                List<User> list = new ArrayList<User>();
                for (ConsumerRecord<String, String> record : msgList) {
                    // Raw data
                    msg = record.value();
                    if (null == msg || "".equals(msg.trim())) {
                        continue;
                    }
                    try {
                        list.add(JSON.parseObject(msg, User.class));
                    } catch (Exception e) {
                        logger.error("Data format does not match! Data: {}", msg);
                        continue;
                    }
                }
                logger.info("Data emitted by spout: " + list);
                // Send to the bolt
                this.collector.emit(new Values(JSON.toJSONString(list)));
                consumer.commitAsync();
            } else {
                TimeUnit.SECONDS.sleep(3);
                logger.info("Not pulled to data ...");
            }
        } catch (Exception e) {
            logger.error("Message Queue handling Exception!", e);
            try {
                TimeUnit.SECONDS.sleep(10);
            } catch (InterruptedException e1) {
                logger.error("Pause failed!", e1);
            }
        }
    }
}
```
Note: If the spout fails when sending the data, the data will be re-sent!
The spout class above mainly transfers the data obtained from Kafka to the bolt, and the bolt class then processes the data. If processing succeeds, the data is written to the database and the spout is acknowledged, which avoids retransmission. A sketch of what the reliable-emit side of this can look like is shown below.
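For Storm to re-send failed data, the spout has to emit tuples anchored to a message ID and implement ack/fail. The following is only a minimal sketch of that mechanism, not the project's code: the class name, the pending map, and the msgId scheme are illustrative.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;

// Sketch of a reliable spout: emits are anchored to a message ID so Storm can
// call ack() or fail() for each tuple. All names here are illustrative.
public class ReliableSpoutSketch extends BaseRichSpout {

    private SpoutOutputCollector collector;
    // Payloads that have been emitted but not yet acked, keyed by message ID
    private final Map<Object, String> pending = new ConcurrentHashMap<>();

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void nextTuple() {
        // Poll Kafka as in the spout above, then for each JSON payload:
        // pending.put(msgId, jsonList);
        // collector.emit(new Values(jsonList), msgId);   // anchored emit
    }

    @Override
    public void ack(Object msgId) {
        // The tuple was fully processed downstream: forget the payload
        pending.remove(msgId);
    }

    @Override
    public void fail(Object msgId) {
        // The tuple failed or timed out: re-emit the cached payload
        String jsonList = pending.get(msgId);
        if (jsonList != null) {
            collector.emit(new Values(jsonList), msgId);
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // The project declares Constants.FIELD here; "list" is just a placeholder
        declarer.declare(new Fields("list"));
    }
}
```

Note that the acknowledgement also depends on the bolt side: a BaseBasicBolt acks its input tuples automatically, while a BaseRichBolt has to call collector.ack explicitly.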
The bolt class handles the business logic mainly in its execute method, which is where the main processing is implemented. Note that only one bolt is used here, so there is no need to declare output fields for forwarding again.
The implementation code is as follows:
```java
@Override
public void execute(Tuple tuple) {
    String msg = tuple.getStringByField(Constants.FIELD);
    try {
        List<User> listUser = JSON.parseArray(msg, User.class);
        // Remove the users whose age is less than 10
        if (listUser != null && listUser.size() > 0) {
            Iterator<User> iterator = listUser.iterator();
            while (iterator.hasNext()) {
                User user = iterator.next();
                if (user.getAge() < 10) {
                    logger.warn("Bolt移除的数据:{}", user);
                    iterator.remove();
                }
            }
            if (listUser != null && listUser.size() > 0) {
                userService.insertBatch(listUser);
            }
        }
    } catch (Exception e) {
        logger.error("Bolt的数据处理失败!数据:{}", msg, e);
    }
}
```
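To make the "no forwarding" note above concrete, here is a minimal sketch of the two declareOutputFields methods; Constants.FIELD comes from the code above, the wrapper class names are made up, and the exact declarations in the project may differ.

```java
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.tuple.Fields;

class SpoutFieldsSketch {
    // In the spout: declare the single field that carries the JSON list to the bolt.
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields(Constants.FIELD));
    }
}

class BoltFieldsSketch {
    // In the last bolt: nothing to declare, because it only writes to MySQL
    // and does not emit tuples to any downstream bolt.
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // intentionally left empty
    }
}
```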
After writing the spout and the bolt, we write Storm's main class.
Storm's main class mainly submits the topology. When submitting the topology, the spout and the bolt need to be set accordingly. A topology can run in two modes:
One is local mode, which uses a LocalCluster to simulate the Storm environment with the local Storm jar:
```java
LocalCluster cluster = new LocalCluster();
cluster.submitTopology("TopologyApp", conf, builder.createTopology());
```
The other is remote mode, that is, running in a Storm cluster:
```java
StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
```
For convenience, both modes are written and controlled by the args parameter of the main method.
The topology-related configuration is explained in detail in the code comments, so I won't repeat it here.
The code is as follows:
```java
public void runStorm(String[] args) {
    // Define a topology
    TopologyBuilder builder = new TopologyBuilder();
    // Set 1 executor (thread), default one task
    builder.setSpout(Constants.KAFKA_SPOUT, new KafkaInsertDataSpout(), 1);
    // shuffleGrouping: random grouping. Set 1 executor (thread) and 1 task
    builder.setBolt(Constants.INSERT_BOLT, new InsertBolt(), 1)
           .setNumTasks(1)
           .shuffleGrouping(Constants.KAFKA_SPOUT);
    Config conf = new Config();
    // Set one acker
    conf.setNumAckers(1);
    // Set one worker
    conf.setNumWorkers(1);
    try {
        // If arguments are present, submit to the cluster and treat the first argument
        // as the topology name; if there are no arguments, submit locally
        if (args != null && args.length > 0) {
            logger.info("Running remote mode");
            StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
        } else {
            // Start local mode
            logger.info("Running local mode");
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("TopologyApp", conf, builder.createTopology());
        }
    } catch (Exception e) {
        logger.error("Storm start failed! Program exits!", e);
        System.exit(1);
    }
    logger.info("Storm starts successfully ...");
}
```
Well, after writing the Kafka- and Storm-related code, let's integrate them with Springboot!
Before integrating with Springboot, we first have to solve a few problems.
1. How do we submit Storm's topology in a Springboot program?
Storm is started by submitting a topology, which is usually done from a main method, but Springboot is also generally started from a main method. So how do we solve this?
- Workaround: Write the submission of Storm's topology in the main class that starts Springboot, so it is started together with Springboot.
- Experimental result: They can be started together (it is indeed possible). But then comes the next problem: the bolt and spout classes cannot use Spring annotations.
2. How do we let the bolt and spout classes use Spring annotations?
- Solution: Knowing that the spout and bolt classes are instantiated on the Nimbus side, serialized and sent to the supervisors, and then deserialized there, annotations cannot be used. So we can change our thinking: since annotations cannot be used, we simply fetch the Spring beans dynamically.
- Experimental result: After using the dynamic bean-fetching approach, Storm can be started successfully.
3. Sometimes it starts normally, sometimes it doesn't, and the dynamic bean cannot be obtained?
- Solution: After solving problems 1 and 2, problem 3 still appeared occasionally. It took a long time to track down: the cause was the hot-deployment dependency previously added to Springboot; after removing it, the problem no longer appeared.
These are the three problems I ran into during the integration. The solutions seem feasible for now; perhaps the problems had other causes, but after integrating this way no other problems appeared. If the problems and solutions above are described inappropriately, please point it out!
After solving the problems above, let's get back to the code.
The entry point of the program, that is, the main class after integration, is as follows:
```java
@SpringBootApplication
public class Application {

    public static void main(String[] args) {
        // Start the embedded Tomcat and initialize the Spring context and its components
        ConfigurableApplicationContext context = SpringApplication.run(Application.class, args);
        GetSpringBean springBean = new GetSpringBean();
        springBean.setApplicationContext(context);
        TopologyApp app = context.getBean(TopologyApp.class);
        app.runStorm(args);
    }
}
```
The code for dynamically fetching Spring beans is as follows:
```java
public class GetSpringBean implements ApplicationContextAware {

    private static ApplicationContext context;

    public static Object getBean(String name) {
        return context.getBean(name);
    }

    public static <T> T getBean(Class<T> c) {
        return context.getBean(c);
    }

    @Override
    public void setApplicationContext(ApplicationContext applicationContext) throws BeansException {
        if (applicationContext != null) {
            context = applicationContext;
        }
    }
}
```
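As a usage example, a bolt can obtain its Spring beans in prepare() instead of via annotations. Below is a minimal sketch under the assumption that the bolt extends BaseRichBolt and that UserService is the service bean used in the execute method earlier; the class name SpringAwareBolt is illustrative, not the project's code.

```java
import java.util.Map;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.base.BaseRichBolt;

// Illustrative base class: fetch Spring beans dynamically when the bolt is
// prepared on the worker, since the deserialized bolt is not managed by Spring.
public abstract class SpringAwareBolt extends BaseRichBolt {

    protected transient OutputCollector collector;
    protected transient UserService userService;

    @Override
    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
        // Plays the role of @Autowired, but resolved at runtime through the helper above
        this.userService = GetSpringBean.getBean(UserService.class);
    }
}
```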
That's the main code of this post; the rest is basically the same as before.
Test results
After the program starts successfully, we call the interface to add some data to Kafka.
New data request:
```
POST http://localhost:8087/api/user
{"name":"张三","age":20}
{"name":"李四","age":10}
{"name":"王五","age":5}
```
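The controller behind this interface is not shown in this post. A minimal sketch of what such an endpoint could look like, assuming a plain KafkaProducer<String, String> bean and illustrative class names (the project's actual implementation may differ):

```java
import com.alibaba.fastjson.JSON;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// Illustrative controller: accepts a user and writes it to the USER_TOPIC topic.
@RestController
@RequestMapping("/api")
public class UserControllerSketch {

    // Assumed to be configured elsewhere from the kafka.* properties
    private final KafkaProducer<String, String> producer;

    public UserControllerSketch(KafkaProducer<String, String> producer) {
        this.producer = producer;
    }

    @PostMapping("/user")
    public String addUser(@RequestBody User user) {
        // Serialize the user to JSON and send it to Kafka for the spout to consume
        producer.send(new ProducerRecord<>("USER_TOPIC", JSON.toJSONString(user)));
        return "success";
    }
}
```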
After the data has been added successfully, we can use Xshell to view the data in the Kafka cluster.
Input: **kafka-console-consumer.sh --zookeeper master:2181 --topic USER_TOPIC --from-beginning**
You can then see the messages that were just written in the consumer's output.
This indicates that the data was successfully written to Kafka.
Since the data from Kafka is processed in real time, we can also watch the printed statements in the console.
Console output:
```
INFO  com.pancm.storm.spout.KafkaInsertDataSpout - Spout发射的数据:[{"age":5,"name":"王五"}, {"age":10,"name":"李四"}, {"age":20,"name":"张三"}]
WARN  com.pancm.storm.bolt.InsertBolt - Bolt移除的数据:{"age":5,"name":"王五"}
INFO  com.alibaba.druid.pool.DruidDataSource - {dataSource-1} inited
DEBUG com.pancm.dao.UserDao.insertBatch - ==> Preparing: insert into t_user (name,age) values (?,?) , (?,?)
DEBUG com.pancm.dao.UserDao.insertBatch - ==> Parameters: 李四(String), 10(Integer), 张三(String), 20(Integer)
DEBUG com.pancm.dao.UserDao.insertBatch - <== Updates: 2
INFO  com.pancm.service.impl.UserServiceImpl - 批量新增2条数据成功!
```
The processing procedure and the successful result can both be seen in the console.
We can then also query all the data in the database through the interface.
Query request:
GET http://localhost:8087/api/user
Return result:
[{"id":1,"name":"李四","age":10},{"id":2,"name":"张三","age":20}]
The returned result clearly matches our expectations.
Conclusion
That's it for integrating Kafka and Storm with Springboot for now. This article only briefly introduces the related usage; real applications may be more complex. If you have better ideas or suggestions, feel free to leave a comment to discuss!
I have put the Springboot-Kafka-Storm integration project on GitHub; if you find it useful, please give it a star.
GitHub address: https://github.com/xuwujing/springBoot-study
By the way, there is also a project that integrates Kafka with Storm on my GitHub.
Address: https://github.com/xuwujing/kafka-study
This is the end of this article. Thank you for reading.
Copyright Notice:
Empty Realm
Blog Park Source: http://www.cnblogs.com/xuwujing
CSDN Source: http://blog.csdn.net/qazwsxpcm
Personal blog Source: http://www.panchengming.com
Original content is not easy to produce. If you reproduce it, please indicate the source. Thank you!