There are three consumption patterns (delivery semantics) in Kafka: at most once, at least once, and exactly once. Three modes exist because the two client-side actions, processing a message and committing the offset back to Kafka, are not atomic.

1. At most once: the offset is committed automatically after the client receives the message, before the message is processed, so Kafka considers it consumed and advances the offset; if processing then fails, the message is lost.
2. At least once: the client receives the message, processes it, and only then commits the offset. If the network drops or the program crashes after processing but before the commit, Kafka considers the message not yet consumed and delivers it again, producing duplicates.
3. Exactly once: message processing and the offset commit are made atomic by placing them in the same transaction.
Starting from these points, this article elaborates how to realize each of the three modes in detail.

1. at-most-once (at most once)

Set enable.auto.commit to true and auto.commit.interval.ms to a small interval, and do not call commitSync() from the client; Kafka then commits offsets automatically within that interval. Example:
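A minimal sketch under these settings (the broker address, group id, and topic names mirror the later examples in this article, and process() stands in for the application's message handler):

public void atMostOnce() {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("group.id", "test-1");
    props.put("enable.auto.commit", "true");     // auto-commit on
    props.put("auto.commit.interval.ms", "100"); // small interval: the offset may be committed before processing finishes
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    consumer.subscribe(Arrays.asList("my-topic", "bar"));
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(100);
        for (ConsumerRecord<String, String> record : records)
            process(record); // no explicit commit: if this crashes, the already-committed message is lost
    }
}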
2. at-least-once (at least once)

Method One: set enable.auto.commit to false; after processing, the client calls commitSync() to advance the offset. Example:

public void leastOnce() {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("group.id", "test-1");
    props.put("enable.auto.commit", "false"); // disable auto-commit
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    consumer.subscribe(Arrays.asList("my-topic", "bar"));
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(100);
        for (ConsumerRecord<String, String> record : records)
            process(record);
        consumer.commitSync(); // commit the offset only after processing
    }
}
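If the process dies after process(record) returns but before commitSync() completes, the uncommitted records are redelivered after a restart; that redelivery is precisely what makes this pattern at-least-once rather than at-most-once. commitSync() blocks until the broker acknowledges the commit; commitAsync() can be used instead to trade that certainty for throughput.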
Method Two: set enable.auto.commit to true and auto.commit.interval.ms to a very large interval, and again have the client call commitSync() to advance the offset. Example:

public void leastOnce() {
    Properties props = new Properties();
    props.put("bootstrap.servers", "10.242.1.219:9092");
    props.put("group.id", "test-1");
    props.put("enable.auto.commit", "true");          // auto-commit on...
    props.put("auto.commit.interval.ms", "99999999"); // ...but with an interval so large it effectively never fires
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    consumer.subscribe(Arrays.asList("my-topic", "bar"));
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(100);
        for (ConsumerRecord<String, String> record : records)
            process(record);
        consumer.commitSync(); // commit the offset only after processing
    }
}
3. exactly-once (exactly once)

3.1 Idea

To implement this you must control the message offset yourself and keep track of the current position. The processing of a message and the advancing of its offset must stay in the same transaction: for example, within one transaction, store the result of processing in a MySQL database and update the offset of the message at the same time.

3.2 Implementation

Set enable.auto.commit to false and save the offset carried in each ConsumerRecord to the database. When the partition assignment changes a rebalance is required; the following events trigger a partition change:
1. the number of partitions in a topic the consumer subscribes to changes;
2. a topic is created or deleted;
3. a member of the consumer group dies;
4. a new consumer joins the group via a join call.

The consumer captures these events by implementing the ConsumerRebalanceListener interface, and moves to the stored position of each assigned partition by calling the seek(TopicPartition, long) method.

3.3 Experiment

1. First, create the table that stores the offset per topic and partition:
DROP TABLE IF EXISTS `tb_yx_message`;
CREATE TABLE `tb_yx_message` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `topic` varchar(128) NOT NULL DEFAULT '' COMMENT 'topic',
  `kpartition` varchar(32) NOT NULL DEFAULT '0' COMMENT 'partition',
  `offset` bigint(20) NOT NULL DEFAULT '0' COMMENT 'offset',
  PRIMARY KEY (`id`),
  UNIQUE KEY `uniq_key` (`topic`,`kpartition`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='partition message table'; # about 100,000 rows
2. At the same time, create the MessageDao interface for database access:

public interface MessageDao {

    // insert a new offset record
    @Insert("insert into tb_yx_message(topic,kpartition,offset)" +
            " values(#{topic},#{kpartition},#{offset})")
    int insertWinner(MessageOffsetPO messageOffset);

    // get the offset for a topic/partition
    @Select("select * from tb_yx_message" +
            " where topic=#{topic} and kpartition=#{kpartition}")
    MessageOffsetPO get(@Param("topic") String topic,
                        @Param("kpartition") String partition);

    // update the offset
    @Update("update tb_yx_message set offset=#{offset}" +
            " where topic=#{topic} and kpartition=#{kpartition}")
    int update(@Param("offset") long offset,
               @Param("topic") String topic,
               @Param("kpartition") String partition);
}
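The MessageOffsetPO entity referenced by this interface can be a plain value object; a minimal sketch, assuming Lombok's @Data and @Builder, with fields inferred from the table above:

import lombok.Builder;
import lombok.Data;

@Data
@Builder
public class MessageOffsetPO {
    private Long id;           // primary key
    private String topic;      // topic name
    private String kpartition; // partition number, stored as a string
    private Long offset;       // last processed offset
}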
3. Next, implement the ConsumerRebalanceListener interface. During a partition rebalance the callbacks are invoked in this order: onPartitionsRevoked is called first (notifying the consumer that its current assignment is being revoked), then onPartitionsAssigned (notifying the consumer that a new assignment has arrived).
4. When the assignment is revoked, save the corresponding offsets to the database; when a new assignment arrives (for example, right after startup), read the corresponding partition offsets back from the database and seek to them. The implementation is as follows:
@Slf4j
public class MyConsumerRebalancerListener implements org.apache.kafka.clients.consumer.ConsumerRebalanceListener {

    private Consumer<String, String> consumer;
    private MessageDao messageDao;

    public MyConsumerRebalancerListener(Consumer<String, String> consumer, MessageDao messageDao) {
        this.consumer = consumer;
        this.messageDao = messageDao;
    }

    // the current assignment is being revoked: persist the current offsets
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        for (TopicPartition partition : partitions) {
            long offset = consumer.position(partition);
            MessageOffsetPO build = MessageOffsetPO.builder()
                    .offset(offset)
                    .kpartition(partition.partition() + "")
                    .topic(partition.topic())
                    .build();
            try {
                messageDao.insertWinner(build);
            } catch (Exception e) {
                // a row for this (topic, kpartition) already exists; ignore
            }
            log.info("onPartitionsRevoked topic:{},build:{}", partition.topic(), build);
        }
    }

    // a new assignment has arrived: seek to the stored offsets
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        for (TopicPartition partition : partitions) {
            MessageOffsetPO messageOffsetPO = messageDao.get(partition.topic(), partition.partition() + "");
            if (messageOffsetPO == null) { // a newly seen topic: no record in the database yet
                consumer.seek(partition, 0);
            } else {
                consumer.seek(partition, messageOffsetPO.getOffset() + 1); // resume at the next offset, hence the +1
            }
            log.info("onPartitionsAssigned topic:{},messageOffsetPO:{}", partition, messageOffsetPO);
        }
    }
}
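Note that once a row exists for a (topic, kpartition) pair, insertWinner() fails against the uniq_key unique index, which is why the exception above is silently swallowed and the row is left to be maintained by the offset updates elsewhere. A possible alternative, not used here, would be a single MySQL upsert such as INSERT ... ON DUPLICATE KEY UPDATE offset = VALUES(offset).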
For the client side, the code can be written like this:

/**
 * exactly once
 */
public void exactlyOnce() {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("group.id", "test-1");
    props.put("enable.auto.commit", "false"); // disable auto-commit
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    MyConsumerRebalancerListener rebalancerListener = new MyConsumerRebalancerListener(consumer, messageDao);
    consumer.subscribe(Arrays.asList("test-new-topic-1", "new-topic"), rebalancerListener);
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(100);
        for (ConsumerRecord<String, String> record : records) {
            boolean isException = false;
            try {
                log.info("consume record:{}", record);
                processService.process(record);
            } catch (Exception e) {
                e.printStackTrace();
                isException = true;
            }
            if (isException) {
                // processing failed: the record has not really been consumed,
                // so seek back to its offset and retry from there
                TopicPartition topicPartition = new TopicPartition(record.topic(), record.partition());
                consumer.seek(topicPartition, record.offset());
                log.info("consume exception offset:{}", record.offset());
                break;
            }
            // rebalancerListener.getOffsetMananger().saveOffsetInExternalStore(record.topic(), record.kpartition(), record.offset());
        }
    }
}
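Two pieces work together to give the exactly-once behavior: processService.process() updates the business result and the offset inside one database transaction, and on failure the consumer seeks back to the failed record instead of skipping it. Kafka's own consumer-group offsets are never committed (enable.auto.commit is false and commitSync() is never called); the database is the single source of truth, restored through the listener's seek() calls after every rebalance.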
The call processService.process(record) consumes the message and records its offset; the ProcessService code is as follows:

@Service
@Slf4j
public class ProcessService {

    @Autowired
    MessageDao messageDao;

    @Transactional(rollbackFor = Exception.class)
    public void process(ConsumerRecord<String, String> record) {
        log.info("record:{}", record);
        // handle the message; here it is simply printed
        System.out.println(">>>>>>>>>" + Thread.currentThread().getName() + "_" + record);
        // update the offset within the same transaction
        messageDao.update(record.offset(), record.topic(), record.partition() + "");
    }
}
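Because of rollbackFor = Exception.class, an exception thrown while handling the message also rolls back the offset update, leaving the stored offset pointing at the last fully processed record; the seek-and-retry branch in exactlyOnce() then resumes from exactly that position.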
3.4 Run the program

Scenario 1: the consumer subscribes to topic test-new-topic-1 (never subscribed before), then the producer sends two messages to it; the consumer receives both without throwing an exception. The resulting log is as follows:
2018-01-11 14:36:07,948 main INFO MyConsumerRebalancerListener.onPartitionsAssigned:53 - onPartitionsAssigned topic:test-new-topic-1-0,messageOffsetPO:null,offset:{}
2018-01-11 14:37:15,156 main INFO KafkaConsumeTest.exactlyOnce:106 - consume record:ConsumerRecord(topic = test-new-topic-1, partition = 0, offset = 0, CreateTime = 1515652635056, serialized key size = 1, serialized value size = 14, headers = RecordHeaders(headers = [], isReadOnly = false), key = 0, value = message:0-test)
2018-01-11 14:37:15,183 main INFO ProcessService.process:23 - record:ConsumerRecord(topic = test-new-topic-1, partition = 0, offset = 0, CreateTime = 1515652635056, serialized key size = 1, serialized value size = 14, headers = RecordHeaders(headers = [], isReadOnly = false), key = 0, value = message:0-test)
2018-01-11 14:37:15,238 main INFO KafkaConsumeTest.exactlyOnce:106 - consume record:ConsumerRecord(topic = test-new-topic-1, partition = 0, offset = 1, CreateTime = 1515652635062, serialized key size =