Background: before the upgrade, the Kafka client version in use was 0.8. We recently upgraded the Kafka client and wrote new consumer and producer code, which showed no problems in local testing: messages were consumed and produced normally. However, once a recent project ran the new code against a large volume of data, repeated consumption kept occurring. The troubleshooting and resolution process is recorded here, to avoid stepping into the same pit again.
Finding the problem: since a ConsumerRecord object exposes the partition and offset of the current message, the partition and offset of every consumed message were recorded in the log. While monitoring the logs, we found that offsets for the same partition showed up in multiple threads, with differing offset values, which suggested repeated consumption. This was confirmed when the total number of records consumed by the program turned out to differ considerably from the number of message records in Kafka.
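As a minimal illustration of this kind of logging (the SLF4J-style logger and the surrounding poll loop are assumptions for the sketch, not the original project code):

// inside the consumer's poll loop: log the partition and offset of every message
ConsumerRecords<String, String> records = consumer.poll(100);
for (ConsumerRecord<String, String> record : records) {
    log.info("topic={} partition={} offset={}",
            record.topic(), record.partition(), record.offset());
}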
Resolution process: searching the Internet for how to solve Kafka repeated consumption, the common answer is that Kafka did not commit the offset within the session timeout. Following that line of thinking, the consumer poll timeout was changed to 100 ms, i.e. data is polled from Kafka every 100 ms, and the auto-commit interval and session timeout were set as follows:
props.put("auto.commit.interval.ms", "1000");
props.put("session.timeout.ms", "30000");
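For context, a sketch of the consumer configuration at this stage; note that auto.commit.interval.ms only takes effect when enable.auto.commit is true, which the post implies but does not show (the bootstrap servers, group id, and deserializers are illustrative placeholders):

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092"); // placeholder
props.put("group.id", "my-group");                // placeholder
props.put("enable.auto.commit", "true");          // required for the auto-commit interval to apply
props.put("auto.commit.interval.ms", "1000");
props.put("session.timeout.ms", "30000");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");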
After testing, repeated consumption still occurred whenever the Kafka data volume was large. Printing the number of records returned by each poll showed that large batches were coming back, and after a while (generally around 30 s) the following error was reported:
org.apache.kafka.clients.consumer.CommitFailedException:
Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured session.timeout.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records. [com.bonc.framework.server.kafka.consumer.ConsumerLoop]
In other words, the commit cannot be completed because the group has already rebalanced and the partitions have been assigned to another member: the time between subsequent calls to poll() exceeded the configured session.timeout.ms, which usually means the poll loop is spending too much time processing messages. The issue can be resolved either by increasing the session timeout or by using max.poll.records to reduce the maximum number of records returned by a single poll().
The following parameter was then set:
// the maximum number of records returned by a single poll from Kafka;
// these max.poll.records records must be processed within session.timeout.ms
props.put("max.poll.records", "100");
The value of max.poll.records needs to be weighed against session.timeout.ms, i.e. it must actually be possible to process those 100 records within the session.timeout.ms window.
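One way to sanity-check that sizing, sketched here as an assumption rather than anything from the original post, is to time how long each batch takes to process:

long start = System.currentTimeMillis();
ConsumerRecords<String, String> records = consumer.poll(100);
for (ConsumerRecord<String, String> record : records) {
    process(record); // hypothetical per-message handler
}
long elapsed = System.currentTimeMillis() - start;
if (elapsed > 20000) {
    // getting close to the 30000 ms session timeout: lower max.poll.records
    log.warn("processed {} records in {} ms", records.count(), elapsed);
}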
Note:
props.put("session.timeout.ms", "30000");
// request.timeout.ms is the maximum time the client waits for the response to a request
// (such as an offset commit); it needs to be greater than session.timeout.ms
props.put("request.timeout.ms", "40000");
It is also important to pay attention to the fetch.min.bytes parameter, the minimum amount of data the server returns for a fetch request from the consumer. It is best to set this parameter explicitly, otherwise problems may occur. The recommended setting is:
// the minimum amount of data the server sends to the consumer; if less data is
// available, the server waits until the minimum is met. The default of 1 means
// the server responds as soon as any data is available.
props.put("fetch.min.bytes", "1");
To summarize:
In general, repeated consumption in Kafka comes down to offsets not being committed normally, so the problem can be resolved by adjusting the configuration until offsets are committed as expected. The main configurations mentioned above are consolidated below:
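A consolidated sketch of everything discussed above, assuming the 0.9/0.10-era Java client with auto-commit enabled (bootstrap servers, group id, and topic are illustrative placeholders):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerLoopExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "my-group");                // placeholder
        props.put("enable.auto.commit", "true");          // auto-commit offsets
        props.put("auto.commit.interval.ms", "1000");     // auto-commit interval
        props.put("session.timeout.ms", "30000");         // session timeout
        props.put("request.timeout.ms", "40000");         // greater than session.timeout.ms
        props.put("max.poll.records", "100");             // sized to be processable within session.timeout.ms
        props.put("fetch.min.bytes", "1");                // respond as soon as any data is available
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("my-topic")); // placeholder topic
        try {
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(100); // poll(long) of that client era
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        } finally {
            consumer.close();
        }
    }
}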