Design the Kafka High Level Consumer


Original: https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example

Why use the High Level Consumer
    1. In some scenarios we want to read messages with multiple threads and do not care about the order in which the messages are consumed from Kafka; we only care that every message gets consumed. The High Level Consumer abstracts exactly this kind of consumption.

    2. Messages are consumed through consumer groups. A consumer group can contain multiple consumers, each consumer being one thread, and each partition of a topic can only be read by one consumer in the group. The consumer group keeps the latest offset of each partition in ZooKeeper, so under normal operation messages are not consumed twice.

    3. Because a consumer's offset is not pushed to ZooKeeper in real time (the commit interval is configurable), a consumer that crashes suddenly may re-read some messages after it restarts.
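
If those duplicate reads after a crash matter to you, one option with the 0.8.x high level consumer is to disable the periodic auto commit and commit offsets yourself after a batch has been fully processed. The sketch below only illustrates that idea; the class name ManualCommitSketch is hypothetical, while the auto.commit.enable property and the commitOffsets() call are part of the old high level consumer API.

import java.util.Properties;
import kafka.consumer.ConsumerConfig;
import kafka.javaapi.consumer.ConsumerConnector;

public class ManualCommitSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", args[0]);   // e.g. "server01:2181" (placeholder)
        props.put("group.id", args[1]);            // consumer group id (placeholder)
        // Disable periodic auto commit; offsets are written only when commitOffsets() is called.
        props.put("auto.commit.enable", "false");
        ConsumerConnector consumer =
                kafka.consumer.Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // ... create streams and process a batch of messages here (see the full example below) ...

        // Write the current offsets of all partitions owned by this connector to ZooKeeper.
        consumer.commitOffsets();
        consumer.shutdown();
    }
}
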
Designing a High Level Consumer

The High Level Consumer can and should be used in a multi-threaded environment. The threading model revolves around the number of threads (which is also the number of consumers in the group) and the number of partitions of the topic, and the rules are listed below:

    1. If you provide more threads than there are partitions, some threads will never receive a message;
    2. If you provide fewer threads than there are partitions, some threads will receive messages from more than one partition;
    3. When a thread receives messages from more than one partition, there is no ordering guarantee across partitions: it may receive 5 messages from partition 3, then 6 messages from partition 4, and then another 10 messages from partition 3 (within a single partition the order is still preserved);
    4. Adding more threads causes Kafka to rebalance, which may change the mapping between partitions and threads;
    5. Because the consumer only commits its offsets to ZooKeeper periodically, shutting a consumer down abruptly can cause messages to be read again after a restart. To reduce that risk, give the consumer time to sync its offsets to ZooKeeper before shutdown, for example with Thread.sleep(10000), and shut the threads down in an orderly way, as the sketch after this list shows.
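
The following is a minimal shutdown sketch for rule 5, written against the ConsumerGroupExample class shown later in this article; it assumes that class's consumer and executor fields plus an extra import of java.util.concurrent.TimeUnit. The awaitTermination step is an addition to the original text.

    public void shutdown() {
        // Release the partitions owned by this consumer; no new messages will be fetched.
        if (consumer != null) consumer.shutdown();
        // Stop accepting new tasks in the thread pool.
        if (executor != null) executor.shutdown();
        try {
            // Give the worker threads a few seconds to finish the messages they are processing.
            if (executor != null && !executor.awaitTermination(5, TimeUnit.SECONDS)) {
                System.out.println("Timed out waiting for consumer threads to shut down, exiting uncleanly");
            }
        } catch (InterruptedException e) {
            System.out.println("Interrupted during shutdown, exiting uncleanly");
        }
    }
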

Example

Maven Dependency
        <!-- Kafka messages -->
        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka_2.10</artifactId>
            <version>0.8.2.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka-clients</artifactId>
            <version>0.8.2.0</version>
        </dependency>


Consumer Threads
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.message.MessageAndMetadata;

public class ConsumerThread implements Runnable {

    private KafkaStream kafkaStream;
    // Number of the thread, used only for logging
    private int threadNumber;

    public ConsumerThread(KafkaStream kafkaStream, int threadNumber) {
        this.threadNumber = threadNumber;
        this.kafkaStream = kafkaStream;
    }

    public void run() {
        ConsumerIterator<byte[], byte[]> it = kafkaStream.iterator();
        // The loop keeps reading data from Kafka until the process is interrupted manually
        while (it.hasNext()) {
            MessageAndMetadata<byte[], byte[]> metaData = it.next();
            StringBuilder sb = new StringBuilder();
            sb.append("Thread: ").append(threadNumber).append(" ");
            sb.append("Partition: ").append(metaData.partition()).append(" ");
            sb.append("Key: ").append(metaData.key()).append(" ");
            sb.append("Message: ").append(new String(metaData.message())).append(" ");
            System.out.println(sb.toString());
        }
        System.out.println("Shutting down Thread: " + threadNumber);
    }
}
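
Note that it.hasNext() blocks forever when no new message arrives. If the thread should stop on its own when the topic goes quiet, the high level consumer's consumer.timeout.ms setting makes hasNext() throw a kafka.consumer.ConsumerTimeoutException after the configured idle time. The variant of the read loop below is only a sketch of that approach; the 10 second value is illustrative, and the property has to be added to the consumer configuration shown in the next section.

    // Assumes the consumer config additionally contains:
    //   props.put("consumer.timeout.ms", "10000");   // idle timeout in ms, illustrative value
    try {
        while (it.hasNext()) {
            MessageAndMetadata<byte[], byte[]> metaData = it.next();
            System.out.println("Thread " + threadNumber + " message: " + new String(metaData.message()));
        }
    } catch (ConsumerTimeoutException e) {   // import kafka.consumer.ConsumerTimeoutException
        System.out.println("Idle timeout reached, shutting down Thread: " + threadNumber);
    }
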


Remaining Program
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConsumerGroupExample {

    private final ConsumerConnector consumer;
    private final String topic;
    private ExecutorService executor;

    public ConsumerGroupExample(String a_zookeeper, String a_groupId, String a_topic) {
        consumer = kafka.consumer.Consumer.createJavaConsumerConnector(
                createConsumerConfig(a_zookeeper, a_groupId));
        this.topic = a_topic;
    }

    public void shutdown() {
        if (consumer != null) consumer.shutdown();
        if (executor != null) executor.shutdown();
    }

    public void run(int a_numThreads) {
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put(topic, new Integer(a_numThreads));
        // The returned map contains every requested topic and its corresponding list of KafkaStreams
        Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap =
                consumer.createMessageStreams(topicCountMap);
        List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(topic);

        // Create a Java thread pool
        executor = Executors.newFixedThreadPool(a_numThreads);

        // Create the consumer threads that consume the messages
        int threadNumber = 0;
        for (final KafkaStream stream : streams) {
            executor.submit(new ConsumerThread(stream, threadNumber));
            threadNumber++;
        }
    }

    private static ConsumerConfig createConsumerConfig(String a_zookeeper, String a_groupId) {
        Properties props = new Properties();
        // The ZooKeeper cluster to connect to; the offsets of the partitions consumed by the group are stored there
        props.put("zookeeper.connect", a_zookeeper);
        // The consumer group's id
        props.put("group.id", a_groupId);
        // How long Kafka waits for a ZooKeeper response (ms)
        props.put("zookeeper.session.timeout.ms", "400");
        // How many milliseconds a ZooKeeper follower may lag behind the master
        props.put("zookeeper.sync.time.ms", "200");
        // How often the consumer commits its offsets to ZooKeeper (ms)
        props.put("auto.commit.interval.ms", "1000");
        return new ConsumerConfig(props);
    }

    public static void main(String[] args) {
        String zooKeeper = args[0];
        String groupId = args[1];
        String topic = args[2];
        int threads = Integer.parseInt(args[3]);

        ConsumerGroupExample example = new ConsumerGroupExample(zooKeeper, groupId, topic);
        example.run(threads);

        // Because the consumer commits offsets to ZooKeeper only periodically (the interval is
        // configurable), shutting the consumer threads down abruptly may cause messages to be
        // read again later. Sleep for a while so the consumer can sync its offsets to ZooKeeper.
        try {
            Thread.sleep(10000);
        } catch (InterruptedException ie) {
        }
        example.shutdown();
    }
}
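
To run the finished example, pass the ZooKeeper connect string, the consumer group id, the topic, and the number of threads as command-line arguments, for example java ConsumerGroupExample server01:2181 group1 myTopic 4 (the host, group, and topic names here are placeholders); the kafka_2.10 and kafka-clients jars from the Maven dependency above, together with their transitive dependencies, must be on the classpath. With 4 threads the topic should have at least 4 partitions, otherwise some threads will stay idle as described in rule 1 above.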

