Original: https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
Why use the High Level Consumer
In some scenarios we want to read messages with multiple threads and do not care about the order in which messages are consumed from Kafka; we only care that the data gets consumed. The High Level Consumer abstracts this kind of consumption.
Messages are consumed by consumer groups, and each consumer group can contain multiple consumers. Each consumer is a thread. Each partition of a topic can be read by only one consumer in a group at a time, and for each partition the consumer group records the latest consumed offset, which is stored in ZooKeeper. Under normal operation, therefore, messages are not consumed twice.
- Because a consumer's offset is not written to ZooKeeper in real time (it is committed periodically, at an interval set through configuration), a consumer that crashes suddenly may read some messages again after it restarts.
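The duplicate-read window described above can be illustrated with a small simulation. It does not use Kafka at all: the class name OffsetCommitSketch, the method replayedAfterCrash, and the idea of committing after every N messages (rather than every N milliseconds, as a time-based commit interval would) are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class OffsetCommitSketch {

    /**
     * Simulates a consumer that processes offsets 0..crashAt-1, commits its
     * position after every commitEvery processed messages, and then crashes.
     * Returns the offsets that are processed a second time after the restart.
     */
    static List<Integer> replayedAfterCrash(int crashAt, int commitEvery) {
        int committed = 0; // next offset to read, as last saved in the simulated offset store
        for (int offset = 0; offset < crashAt; offset++) {
            // "Process" the message at this offset, then commit periodically
            int processedCount = offset + 1;
            if (processedCount % commitEvery == 0) {
                committed = processedCount;
            }
        }
        // After the crash the consumer resumes from the last committed position,
        // so everything between that position and the crash point is read again.
        List<Integer> replayed = new ArrayList<>();
        for (int offset = committed; offset < crashAt; offset++) {
            replayed.add(offset);
        }
        return replayed;
    }

    public static void main(String[] args) {
        // 13 messages processed, commit every 5: offsets 10, 11 and 12 are read twice
        System.out.println(replayedAfterCrash(13, 5)); // [10, 11, 12]
    }
}
```

A shorter commit interval narrows the window of repeated messages but never closes it completely, which is why downstream processing should tolerate occasional duplicates.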
Designing a High Level Consumer
The High Level Consumer can and should be used in multi-threaded environments. The number of threads (which is also the number of consumers in the group) is related to the number of partitions in the topic. Some rules are listed below:
- When more threads are provided than there are partitions, some threads will never receive a message.
- When fewer threads are provided than there are partitions, some threads receive messages from multiple partitions.
- When a thread receives messages from more than one partition, the order in which messages arrive across those partitions is not guaranteed. For example, it may receive 5 messages from Partition3, then 6 messages from Partition4, and then 10 more messages from Partition3.
- When additional threads (consumers) join the group, Kafka rebalances, which may change the mapping between partitions and threads.
- To reduce the chance of messages being read again after a restart, the example calls Thread.sleep(10000) before shutdown so that the consumer has time to synchronize its offsets to ZooKeeper.
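The first two rules can be made concrete with a simplified model of how partitions are divided among threads. This is only a sketch: it assumes a contiguous, range-style split in which the earlier threads absorb any remainder, and the names PartitionAssignmentSketch and assign are hypothetical. It is not Kafka's rebalancing code, and the real assignment depends on the Kafka version and configuration.

```java
import java.util.ArrayList;
import java.util.List;

final class PartitionAssignmentSketch {

    /** Returns, for each thread index, the list of partition ids it would own. */
    static List<List<Integer>> assign(int numPartitions, int numThreads) {
        int base = numPartitions / numThreads;   // partitions that every thread gets
        int extra = numPartitions % numThreads;  // the first `extra` threads get one more
        List<List<Integer>> result = new ArrayList<>();
        int next = 0;
        for (int t = 0; t < numThreads; t++) {
            int count = base + (t < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                owned.add(next++);
            }
            result.add(owned);
        }
        return result;
    }

    public static void main(String[] args) {
        // More threads than partitions: the last thread stays idle
        System.out.println(assign(2, 3)); // [[0], [1], []]
        // Fewer threads than partitions: each thread reads several partitions
        System.out.println(assign(5, 2)); // [[0, 1, 2], [3, 4]]
    }
}
```

In this model, a thread that owns several partitions sees their messages interleaved, which matches the ordering caveat in the third rule.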
Sample Example
Maven Dependency
<!-- Kafka messages -->
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka_2.10</artifactId>
    <version>0.8.2.0</version>
</dependency>
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>0.8.2.0</version>
</dependency>
Consumer Threads
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.message.MessageAndMetadata;

public class ConsumerThread implements Runnable {
    private final KafkaStream<byte[], byte[]> kafkaStream;
    // Thread number
    private final int threadNumber;

    public ConsumerThread(KafkaStream<byte[], byte[]> kafkaStream, int threadNumber) {
        this.threadNumber = threadNumber;
        this.kafkaStream = kafkaStream;
    }

    @Override
    public void run() {
        ConsumerIterator<byte[], byte[]> it = kafkaStream.iterator();
        // The loop keeps reading from Kafka until the consumer is shut down
        // or the process is interrupted
        while (it.hasNext()) {
            MessageAndMetadata<byte[], byte[]> metaData = it.next();
            // The key may be null for messages that were sent without one
            byte[] key = metaData.key();
            // Build a fresh line per message; reusing one buffer would re-print earlier messages
            StringBuilder sb = new StringBuilder();
            sb.append("Thread: ").append(threadNumber).append(" ");
            sb.append("Part: ").append(metaData.partition()).append(" ");
            sb.append("Key: ").append(key == null ? null : new String(key)).append(" ");
            sb.append("Message: ").append(new String(metaData.message()));
            System.out.println(sb.toString());
        }
        System.out.println("Shutting down Thread: " + threadNumber);
    }
}
Remaining Program
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConsumerGroupExample {
    private final ConsumerConnector consumer;
    private final String topic;
    private ExecutorService executor;

    public ConsumerGroupExample(String a_zookeeper, String a_groupId, String a_topic) {
        consumer = kafka.consumer.Consumer.createJavaConsumerConnector(
                createConsumerConfig(a_zookeeper, a_groupId));
        this.topic = a_topic;
    }

    public void shutdown() {
        if (consumer != null) consumer.shutdown();
        if (executor != null) executor.shutdown();
    }

    public void run(int a_numThreads) {
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put(topic, a_numThreads);
        // The returned map contains every requested topic and its list of KafkaStreams
        Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap =
                consumer.createMessageStreams(topicCountMap);
        List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(topic);

        // Create the Java thread pool
        executor = Executors.newFixedThreadPool(a_numThreads);

        // Create the consumer threads that consume the messages
        int threadNumber = 0;
        for (final KafkaStream<byte[], byte[]> stream : streams) {
            executor.submit(new ConsumerThread(stream, threadNumber));
            threadNumber++;
        }
    }

    private static ConsumerConfig createConsumerConfig(String a_zookeeper, String a_groupId) {
        Properties props = new Properties();
        // ZooKeeper cluster to connect to; the consumer offsets of each partition are stored there
        props.put("zookeeper.connect", a_zookeeper);
        // ID of the consumer group
        props.put("group.id", a_groupId);
        // How long Kafka waits for a ZooKeeper response (ms)
        props.put("zookeeper.session.timeout.ms", "400");
        // How many milliseconds a ZooKeeper follower may lag behind the master
        props.put("zookeeper.sync.time.ms", "200");
        // Interval at which the consumer commits its offsets to ZooKeeper (ms)
        props.put("auto.commit.interval.ms", "1000");
        return new ConsumerConfig(props);
    }

    public static void main(String[] args) {
        String zooKeeper = args[0];
        String groupId = args[1];
        String topic = args[2];
        int threads = Integer.parseInt(args[3]);

        ConsumerGroupExample example = new ConsumerGroupExample(zooKeeper, groupId, topic);
        example.run(threads);

        // Because the consumer's offset is not written to ZooKeeper in real time
        // (it is committed at a configured interval), shutting down the consumer
        // threads abruptly may lead to messages being read again.
        // The sleep gives the consumer time to synchronize its offsets to ZooKeeper.
        try {
            Thread.sleep(10000);
        } catch (InterruptedException ie) {
            // Ignored: proceed straight to shutdown
        }
        example.shutdown();
    }
}
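A fixed sleep is a blunt way to give worker threads time to finish. As a possible alternative, the sketch below shows the standard JDK pattern of calling shutdown() on an ExecutorService and then awaitTermination(...) with a timeout. It uses only the JDK: the class GracefulShutdownSketch, its stopAndWait helper, and the stand-in tasks are hypothetical. In the Kafka example, the consumer connector would presumably still need to be shut down first so that the blocked stream iterators return.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class GracefulShutdownSketch {

    /**
     * Stops accepting new tasks and waits up to timeoutMillis for running ones.
     * Returns true if every task finished before the timeout elapsed.
     */
    static boolean stopAndWait(ExecutorService executor, long timeoutMillis) {
        executor.shutdown(); // already submitted tasks keep running
        try {
            return executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        AtomicInteger finished = new AtomicInteger();
        for (int i = 0; i < 2; i++) {
            // Stand-in for a consumer thread that is finishing its last messages
            executor.submit(() -> {
                try {
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                finished.incrementAndGet();
            });
        }
        boolean clean = stopAndWait(executor, 5000);
        System.out.println("clean=" + clean + ", finished=" + finished.get());
    }
}
```

Compared with an unconditional sleep, this returns as soon as the workers are done and reports whether the timeout was hit, so the caller can log or escalate an unclean shutdown.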