Kafka Consumer Example Implemented in Java

Last Update:2015-06-08 Source: Internet

Author: User

Tags zookeeper


Using Java to implement Kafka consumers

    package com.lisg.kafkatest;

    import java.nio.charset.StandardCharsets;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;

    /**
     * Example of a Kafka consumer implemented in Java
     * @author lisg
     */
    public class KafkaConsumer {

        private static final String TOPIC = "test";
        private static final int THREAD_AMOUNT = 1;

        public static void main(String[] args) {

            Properties props = new Properties();
            props.put("zookeeper.connect", "vm1:2181");
            props.put("group.id", "group1");
            props.put("zookeeper.session.timeout.ms", "400");
            props.put("zookeeper.sync.time.ms", "200");
            props.put("auto.commit.interval.ms", "1000");

            Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
            // Number of KafkaStreams (one per consumer thread) used to read each topic
            topicCountMap.put(TOPIC, THREAD_AMOUNT);
            // Several topics can be read at once
            // topicCountMap.put(TOPIC2, 1);
            ConsumerConnector consumer = Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
            Map<String, List<KafkaStream<byte[], byte[]>>> msgStreams = consumer.createMessageStreams(topicCountMap);
            List<KafkaStream<byte[], byte[]>> msgStreamList = msgStreams.get(TOPIC);

            // Use an ExecutorService to schedule the threads
            ExecutorService executor = Executors.newFixedThreadPool(THREAD_AMOUNT);
            for (int i = 0; i < msgStreamList.size(); i++) {
                KafkaStream<byte[], byte[]> kafkaStream = msgStreamList.get(i);
                executor.submit(new HanldMessageThread(kafkaStream, i));
            }

            // Let the consumer run for 20 seconds, then shut it down
            try {
                Thread.sleep(20000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            if (consumer != null) {
                consumer.shutdown();
            }
            if (executor != null) {
                executor.shutdown();
            }
            try {
                if (!executor.awaitTermination(5000, TimeUnit.MILLISECONDS)) {
                    System.out.println("Timed out waiting for consumer threads to shut down, exiting uncleanly");
                }
            } catch (InterruptedException e) {
                System.out.println("Interrupted during shutdown, exiting uncleanly");
            }
        }
    }

    /**
     * The thread that actually processes the messages
     * @author Administrator
     */
    class HanldMessageThread implements Runnable {

        private KafkaStream<byte[], byte[]> kafkaStream = null;
        private int num = 0;

        public HanldMessageThread(KafkaStream<byte[], byte[]> kafkaStream, int num) {
            super();
            this.kafkaStream = kafkaStream;
            this.num = num;
        }

        public void run() {
            ConsumerIterator<byte[], byte[]> iterator = kafkaStream.iterator();
            // hasNext() blocks until a message arrives or the consumer is shut down
            while (iterator.hasNext()) {
                // Decode explicitly as UTF-8 instead of relying on the platform default charset
                String message = new String(iterator.next().message(), StandardCharsets.UTF_8);
                System.out.println("Thread no: " + num + ", message: " + message);
            }
        }
    }

`props.put("auto.commit.interval.ms", "1000");`

This sets how often, in milliseconds, the consumer writes the offsets it has consumed to ZooKeeper.
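The commit interval is a trade-off: a shorter interval narrows the window of messages that can be re-read after a crash, at the cost of more writes to ZooKeeper. As a rough sketch, the settings could be built by a small helper like the one below. The class and method names are hypothetical, and `auto.commit.enable` is not set in the example above; it is included here on the assumption that the 0.8-era consumer configuration is in use, where it defaults to `true`.

```java
import java.util.Properties;

/** Hypothetical helper that makes the commit interval an explicit parameter. */
class ConsumerPropsSketch {

    /** Builds consumer settings like the example's for a given ZooKeeper address and group. */
    static Properties consumerProps(String zookeeper, String groupId, long commitIntervalMs) {
        Properties props = new Properties();
        props.put("zookeeper.connect", zookeeper);
        props.put("group.id", groupId);
        // Assumption: periodic offset commits are left switched on (the 0.8 default)
        props.put("auto.commit.enable", "true");
        props.put("auto.commit.interval.ms", Long.toString(commitIntervalMs));
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps("vm1:2181", "group1", 1000);
        System.out.println(props.getProperty("auto.commit.interval.ms"));
    }
}
```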

Description

Why use a High Level Consumer?

In some scenarios, the code that reads messages from Kafka does not care about managing message offsets; it only wants the message data. The high level consumer provides exactly this.

The first thing to know is that the high level consumer stores the last offset it has read from each partition in ZooKeeper. The offset is stored under the name of the consumer group.
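To make the per-group storage concrete, here is a minimal in-memory sketch. The ZooKeeper-based consumer keeps one node per group, topic, and partition under `/consumers/[group]/offsets/[topic]/[partition]`; the class and method names below are hypothetical stand-ins, and nothing here talks to a real ZooKeeper.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the offsets the high level consumer keeps in ZooKeeper. */
class GroupOffsetStore {

    private final Map<String, Long> offsets = new HashMap<>();

    /** Path layout of the ZooKeeper-based consumer: one node per group, topic and partition. */
    static String offsetPath(String group, String topic, int partition) {
        return "/consumers/" + group + "/offsets/" + topic + "/" + partition;
    }

    void commit(String group, String topic, int partition, long offset) {
        offsets.put(offsetPath(group, topic, partition), offset);
    }

    /** Returns the committed offset, or -1 if this group has never committed for the partition. */
    long committed(String group, String topic, int partition) {
        return offsets.getOrDefault(offsetPath(group, topic, partition), -1L);
    }

    public static void main(String[] args) {
        GroupOffsetStore store = new GroupOffsetStore();
        store.commit("group1", "test", 0, 42L);
        System.out.println(offsetPath("group1", "test", 0));
        System.out.println(store.committed("group1", "test", 0)); // group1 has a position
        System.out.println(store.committed("group2", "test", 0)); // group2 tracks its own
    }
}
```

Because the group name is part of the key, two groups reading the same topic keep independent positions, while two processes that share a group name share one position.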

The consumer group name is global across the Kafka cluster, so make sure that consumers running old logic are shut down before you start new code under the same group name. When a new process starts with an existing group name, Kafka adds its threads to the set of threads consuming the topic and triggers a rebalance. During the rebalance Kafka assigns the available partitions to the available threads, which may move a partition to a different process. If old and new processing logic run at the same time, some messages may still be delivered to the old logic.

Designing a High Level Consumer

The first thing to know about using the high level consumer is that it should be multithreaded. The number of consumer threads is tied to the number of partitions in the topic, and some specific rules apply:

If there are more threads than partitions on the topic, some threads will never receive a message.
If there are more partitions than threads, some threads will receive messages from multiple partitions.
If a thread reads from multiple partitions, the order in which it receives messages is not guaranteed, except that offsets within a single partition are sequential. For example, it may receive 5 messages from partition 10, then 6 from partition 11, then 5 more from partition 10 followed by another 5 from partition 10, even though partition 11 still has messages available.
Adding more consumer processes or threads to the consumer group triggers a rebalance, so a partition that was assigned to thread A may be assigned to thread B afterwards.
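These rules can be illustrated with a simplified range-style assignment that hands each thread a contiguous block of partitions. This is only a sketch under that assumption, not Kafka's actual rebalancing code, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, hypothetical model of dividing partitions among consumer threads. */
class PartitionAssignmentSketch {

    /** Splits partitions 0..partitionCount-1 into contiguous ranges, one range per thread. */
    static List<List<Integer>> assign(int partitionCount, int threadCount) {
        int perThread = partitionCount / threadCount;
        int threadsWithExtra = partitionCount % threadCount;
        List<List<Integer>> result = new ArrayList<>();
        int next = 0;
        for (int t = 0; t < threadCount; t++) {
            // The first few threads take one extra partition when the split is uneven
            int count = perThread + (t < threadsWithExtra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                owned.add(next++);
            }
            result.add(owned);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(assign(2, 3)); // [[0], [1], []]         one thread stays idle
        System.out.println(assign(5, 2)); // [[0, 1, 2], [3, 4]]    threads read several partitions
        System.out.println(assign(5, 3)); // [[0, 1], [2, 3], [4]]  after a third thread joins
    }
}
```

Comparing the last two lines shows the effect of a rebalance in this model: partition 2 belonged to the first thread while there were two threads and moves to the second thread once a third one joins.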

Shutting down the consumer group and error handling

Kafka does not update the offset in ZooKeeper after every message it reads; it commits periodically instead. Because of this delay, the consumer may have read messages whose offsets have not yet been written. If the client shuts down or crashes at that point, those messages are read again after the restart. A broker outage or anything else that changes a partition's leader can also cause messages to be read again.
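The effect of the delayed commit can be simulated without a broker. In the sketch below, a commit after every `commitEvery` messages stands in for the time-based commit, and the "crash" is simply the point where reading stops; all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical simulation of re-reads caused by committing offsets only periodically. */
class DelayedCommitSketch {

    /**
     * Reads offsets [committed, crashAt) and commits only after every commitEvery messages.
     * Returns the offset that was last committed when the "crash" happens.
     */
    static long readUntilCrash(long committed, long crashAt, int commitEvery, List<Long> seen) {
        long lastCommitted = committed;
        for (long offset = committed; offset < crashAt; offset++) {
            seen.add(offset); // message handed to the application
            if ((offset + 1) % commitEvery == 0) {
                lastCommitted = offset + 1; // next offset to read after a restart
            }
        }
        return lastCommitted;
    }

    public static void main(String[] args) {
        List<Long> seen = new ArrayList<>();
        long committed = readUntilCrash(0, 7, 5, seen); // crash after reading offsets 0..6
        System.out.println("committed offset at crash: " + committed); // 5
        readUntilCrash(committed, 10, 5, seen);         // restart from the committed offset
        System.out.println("offsets delivered: " + seen); // 5 and 6 appear twice
    }
}
```

Since duplicates like these cannot be ruled out, message handling should tolerate seeing the same message more than once.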

To keep such re-reads to a minimum, give the consumer a clean shutdown path instead of stopping it with kill -9.

The Java code above shuts down like this:

    if (consumer != null) {
        consumer.shutdown();
    }
    if (executor != null) {
        executor.shutdown();
    }
    try {
        if (!executor.awaitTermination(5000, TimeUnit.MILLISECONDS)) {
            System.out.println("Timed out waiting for consumer threads to shut down, exiting uncleanly");
        }
    } catch (InterruptedException e) {
        System.out.println("Interrupted during shutdown, exiting uncleanly");
    }

After shutdown, the code waits up to 5 seconds to give the consumer threads time to finish processing the messages still buffered in the Kafka stream.
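The example triggers this sequence after a fixed 20-second sleep. A long-running service would more likely run it from a JVM shutdown hook, which fires on normal exit, Ctrl+C, or a plain `kill`, but never on `kill -9`. The following is a minimal sketch of that idea using only the JDK; the class and method names are hypothetical, and the call to `consumer.shutdown()` is left as a comment because no Kafka client is involved here.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper that wraps the shutdown sequence so a shutdown hook can call it. */
class GracefulShutdownSketch {

    /** Stops accepting work and waits up to timeoutMs for running tasks; returns true on a clean exit. */
    static boolean shutdownGracefully(ExecutorService executor, long timeoutMs) {
        executor.shutdown();
        try {
            if (executor.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        executor.shutdownNow(); // give up waiting and interrupt the remaining tasks
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(1);
        executor.submit(() -> System.out.println("handling messages"));

        // In the real consumer, consumer.shutdown() would be called first inside this hook.
        Runtime.getRuntime().addShutdownHook(new Thread(() ->
                System.out.println("clean shutdown: " + shutdownGracefully(executor, 5000))));

        Thread.sleep(1000); // stand-in for the lifetime of the service
        System.exit(0);     // triggers the hook; the idle pool thread would otherwise keep the JVM alive
    }
}
```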

Reference: https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example


