A Kafka Consumer Example Implemented in Java

Source: Internet
Author: User
Tags: zookeeper

Using Java to implement Kafka consumers

    package com.lisg.kafkatest;

    import java.nio.charset.StandardCharsets;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;

    /**
     * Example of a Kafka consumer implemented in Java.
     * @author lisg
     */
    public class KafkaConsumer {

        private static final String TOPIC = "test";
        private static final int THREAD_AMOUNT = 1;

        public static void main(String[] args) {

            Properties props = new Properties();
            props.put("zookeeper.connect", "vm1:2181");
            props.put("group.id", "group1");
            props.put("zookeeper.session.timeout.ms", "400");
            props.put("zookeeper.sync.time.ms", "200");
            props.put("auto.commit.interval.ms", "1000");

            Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
            // Number of KafkaStreams (consumer threads) used to read each topic
            topicCountMap.put(TOPIC, THREAD_AMOUNT);
            // Several topics can be read at once
            // topicCountMap.put(TOPIC2, 1);
            ConsumerConnector consumer = Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
            Map<String, List<KafkaStream<byte[], byte[]>>> msgStreams = consumer.createMessageStreams(topicCountMap);
            List<KafkaStream<byte[], byte[]>> msgStreamList = msgStreams.get(TOPIC);

            // Use an ExecutorService to schedule the threads
            ExecutorService executor = Executors.newFixedThreadPool(THREAD_AMOUNT);
            for (int i = 0; i < msgStreamList.size(); i++) {
                KafkaStream<byte[], byte[]> kafkaStream = msgStreamList.get(i);
                executor.submit(new HandleMessageThread(kafkaStream, i));
            }

            // Let the consumer run for 20 seconds, then shut it down
            try {
                Thread.sleep(20000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            if (consumer != null) {
                consumer.shutdown();
            }
            if (executor != null) {
                executor.shutdown();
            }
            try {
                if (!executor.awaitTermination(5000, TimeUnit.MILLISECONDS)) {
                    System.out.println("Timed out waiting for consumer threads to shut down, exiting uncleanly");
                }
            } catch (InterruptedException e) {
                System.out.println("Interrupted during shutdown, exiting uncleanly");
            }
        }
    }

    /**
     * Thread that actually processes the messages.
     * @author Administrator
     */
    class HandleMessageThread implements Runnable {

        private KafkaStream<byte[], byte[]> kafkaStream = null;
        private int num = 0;

        public HandleMessageThread(KafkaStream<byte[], byte[]> kafkaStream, int num) {
            super();
            this.kafkaStream = kafkaStream;
            this.num = num;
        }

        public void run() {
            ConsumerIterator<byte[], byte[]> iterator = kafkaStream.iterator();
            // hasNext() blocks until a message arrives or the consumer is shut down
            while (iterator.hasNext()) {
                // Decode explicitly as UTF-8 instead of relying on the platform default charset
                String message = new String(iterator.next().message(), StandardCharsets.UTF_8);
                System.out.println("Thread no: " + num + ", message: " + message);
            }
        }
    }

    props.put("auto.commit.interval.ms", "1000");

This setting specifies how often, in milliseconds, the consumer writes its consumed offsets to ZooKeeper.
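To get a feel for what this interval means in practice, the following sketch estimates how many messages might be read again after a crash for a given commit interval. It is only a rough illustration: the class name `AutoCommitSketch`, both helper methods, and the rate of 500 messages per second are assumptions made for this example, and only the property keys come from the code above.

```java
import java.util.Properties;

class AutoCommitSketch {

    /** Builds a subset of the consumer properties shown above, with a configurable commit interval. */
    static Properties consumerProps(long autoCommitIntervalMs) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "vm1:2181");
        props.put("group.id", "group1");
        props.put("auto.commit.interval.ms", Long.toString(autoCommitIntervalMs));
        return props;
    }

    /** Rough upper bound on the messages consumed since the last offset commit. */
    static long worstCaseReplayed(Properties props, long messagesPerSecond) {
        long intervalMs = Long.parseLong(props.getProperty("auto.commit.interval.ms"));
        return messagesPerSecond * intervalMs / 1000;
    }

    public static void main(String[] args) {
        Properties props = consumerProps(1000);
        // Assumed rate of 500 messages per second, purely for illustration.
        System.out.println("Worst-case replayed messages: " + worstCaseReplayed(props, 500));
    }
}
```

A shorter interval reduces the number of re-read messages at the cost of more frequent writes to ZooKeeper.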

Description

Why use the high-level consumer?

In some scenarios, the code that reads messages from Kafka does not need to manage message offsets; it only wants the message data. The high-level consumer provides this by hiding offset handling from the application.

The first thing to know is that the high-level consumer stores the last offset read from each partition in ZooKeeper. The offset is stored under the consumer group name.
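As a small illustration of how the offset is keyed by group name, the sketch below builds the ZooKeeper node path that the ZooKeeper-based consumer uses for a committed offset, `/consumers/<group>/offsets/<topic>/<partition>`. The class and method names are hypothetical helpers for this example, and the code does not talk to ZooKeeper.

```java
class OffsetPathSketch {

    /** Builds the ZooKeeper path under which a group's offset for one partition is stored. */
    static String offsetPath(String group, String topic, int partition) {
        return "/consumers/" + group + "/offsets/" + topic + "/" + partition;
    }

    public static void main(String[] args) {
        // Uses the group and topic names from the example above.
        System.out.println(offsetPath("group1", "test", 0));
    }
}
```

Because the group name is part of the path, two consumers that use the same `group.id` share one set of offsets, while a different `group.id` starts with its own.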

Consumer group names are global across the Kafka cluster, so make sure that any consumers running old logic are shut down before you start new code under the same group name. When a new process starts with the same consumer group name, Kafka adds its threads to the set of threads consuming the topic and triggers a rebalance. During the rebalance, Kafka assigns the available partitions to the available threads, which may move a partition to a different process. If old and new processing logic run at the same time, some messages may still be delivered to the old logic.

Designing a High-Level Consumer

The first thing to know about using the high-level consumer is that it should be multithreaded. The number of consumer threads is related to the number of partitions in the topic, and a few specific rules apply:

    • If the number of threads is greater than the number of partitions in the topic, some threads will never receive any messages.

    • If the number of partitions is greater than the number of threads, some threads will receive messages from multiple partitions.

    • If a thread reads from multiple partitions, the order in which it receives messages is not guaranteed, except that offsets within a single partition are sequential. For example, a thread may receive 5 messages from partition 10, then 6 from partition 11, then 5 more from partition 10 followed by another 5 from partition 10, even though partition 11 still has data available.

    • Adding more consumers to the consumer group triggers a rebalance, so a partition that was assigned to thread A may be assigned to thread B after the new allocation.
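The first two rules above can be made concrete with a small simulation. The sketch below hands out contiguous ranges of partitions to threads, in the spirit of a range-style assignment; it is an illustrative model written for this article, not Kafka's actual assignment code, and the class and method names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class RangeAssignmentSketch {

    /** Returns, for each thread index, the list of partition ids that thread would own. */
    static List<List<Integer>> assign(int partitionCount, int threadCount) {
        int perThread = partitionCount / threadCount; // partitions every thread gets
        int extra = partitionCount % threadCount;     // the first 'extra' threads get one more
        List<List<Integer>> result = new ArrayList<>();
        int next = 0;
        for (int t = 0; t < threadCount; t++) {
            int count = perThread + (t < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                owned.add(next++);
            }
            result.add(owned);
        }
        return result;
    }

    public static void main(String[] args) {
        // More threads than partitions: the last thread stays idle.
        System.out.println(assign(2, 3)); // prints [[0], [1], []]
        // More partitions than threads: each thread reads several partitions.
        System.out.println(assign(5, 2)); // prints [[0, 1, 2], [3, 4]]
    }
}
```

Calling `assign` again with a different thread count shows how a rebalance can move a partition from one thread to another.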

Shutting down the consumer group and error handling

The consumer does not update the offset in ZooKeeper every time it reads a message; it commits periodically instead. Because of this delay, the consumer may have read messages whose offsets have not yet been committed. If the client then shuts down or crashes, those messages are read again after the restart. Broker outages or other events that change a partition's leader can also cause messages to be read more than once.
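Since duplicates cannot be ruled out entirely, one common defence is to make processing tolerant of them. The sketch below remembers the highest offset handled per partition and skips anything at or below it. It is a minimal in-memory illustration with hypothetical names; for it to help across a restart, this state would have to be stored durably together with the processing results, which is not shown here. In the consumer code above, the partition and offset would come from the object returned by `iterator.next()`.

```java
import java.util.HashMap;
import java.util.Map;

class DuplicateFilterSketch {

    // Highest offset already processed, keyed by partition id.
    private final Map<Integer, Long> lastProcessed = new HashMap<>();

    /** Returns true if the message is new and should be processed, false if it is a repeat. */
    boolean shouldProcess(int partition, long offset) {
        Long last = lastProcessed.get(partition);
        if (last != null && offset <= last) {
            return false; // already handled, for example before a restart or rebalance
        }
        lastProcessed.put(partition, offset);
        return true;
    }

    public static void main(String[] args) {
        DuplicateFilterSketch filter = new DuplicateFilterSketch();
        long[] offsets = {5, 6, 6, 7}; // offset 6 is delivered twice
        for (long offset : offsets) {
            String action = filter.shouldProcess(0, offset) ? "process" : "skip";
            System.out.println("partition 0, offset " + offset + ": " + action);
        }
    }
}
```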

To avoid this problem, provide a clean shutdown path instead of using kill -9.

The Java code above provides one way to shut down:

    if (consumer != null) {
        consumer.shutdown();
    }
    if (executor != null) {
        executor.shutdown();
    }
    try {
        if (!executor.awaitTermination(5000, TimeUnit.MILLISECONDS)) {
            System.out.println("Timed out waiting for consumer threads to shut down, exiting uncleanly");
        }
    } catch (InterruptedException e) {
        System.out.println("Interrupted during shutdown, exiting uncleanly");
    }

After shutdown is called, the code waits up to 5 seconds to give the consumer threads time to finish processing the messages still held in the Kafka streams.
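The example shuts down after a fixed 20-second sleep. A long-running consumer would more likely run the same sequence when the process is asked to stop. One possible approach, sketched here with only JDK classes, is a JVM shutdown hook. The class name and the `shutdownAndWait` helper are assumptions for illustration, and the submitted task is a stand-in for the message-handling thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class GracefulShutdownSketch {

    /** Stops accepting new work and waits up to timeoutMs; returns true on clean termination. */
    static boolean shutdownAndWait(ExecutorService executor, long timeoutMs) {
        executor.shutdown();
        try {
            return executor.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        // Daemon threads let this demo exit on its own; a real consumer
        // would keep running until the process is signalled.
        ExecutorService executor = Executors.newFixedThreadPool(1, runnable -> {
            Thread thread = new Thread(runnable);
            thread.setDaemon(true);
            return thread;
        });
        executor.submit(() -> System.out.println("stand-in for a message-handling thread"));

        // The hook runs on normal exit and on SIGTERM (plain kill), but not on kill -9.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            // In the real consumer, consumer.shutdown() would be called here first.
            boolean clean = shutdownAndWait(executor, 5000);
            System.out.println(clean ? "Shut down cleanly" : "Timed out, exiting uncleanly");
        }));
    }
}
```

Because the hook does not run on kill -9, this only helps when the process is stopped with a normal signal.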

Reference: https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example








