Background: before the upgrade, the Kafka client version in use was 0.8. We recently upgraded the Kafka client and wrote new consumer and producer code, which showed no problems in local testing: messages were consumed and produced normally. However, once a recent project ran the new code against a large volume of data, repeated consumption kept occurring. The troubleshooting and resolution process is recorded here, to avoid stepping into the same pit again.
Finding the problem: since a ConsumerRecord object exposes the partition and offset of the current message, the partition and offset of every consumed message were recorded in the log. While monitoring the logs, we found that offsets for the same partition showed up in multiple threads, with differing offset values, which suggested repeated consumption. This was confirmed when the total number of records consumed by the program turned out to differ considerably from the number of message records in Kafka.
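As a minimal illustration of this kind of logging (the SLF4J-style logger and the surrounding poll loop are assumptions for the sketch, not the original project code):

// inside the consumer's poll loop: log the partition and offset of every message
ConsumerRecords<String, String> records = consumer.poll(100);
for (ConsumerRecord<String, String> record : records) {
    log.info("topic={} partition={} offset={}",
            record.topic(), record.partition(), record.offset());
}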
Resolution process: searching the Internet for how to solve Kafka repeated consumption, the common answer is that Kafka did not commit the offset within the session timeout. Following that line of thinking, the consumer poll timeout was changed to 100 ms, i.e. data is polled from Kafka every 100 ms, and the auto-commit interval and session timeout were set as follows:
props.put("auto.commit.interval.ms", "1000");
props.put("session.timeout.ms", "30000");
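For context, a sketch of the consumer configuration at this stage; note that auto.commit.interval.ms only takes effect when enable.auto.commit is true, which the post implies but does not show (the bootstrap servers, group id, and deserializers are illustrative placeholders):

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092"); // placeholder
props.put("group.id", "my-group");                // placeholder
props.put("enable.auto.commit", "true");          // required for the auto-commit interval to apply
props.put("auto.commit.interval.ms", "1000");
props.put("session.timeout.ms", "30000");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");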
After testing, repeated consumption still occurred whenever the Kafka data volume was large. Printing the number of records returned by each poll showed that large batches were coming back, and after a while (generally around 30 s) the following error was reported:
org.apache.kafka.clients.consumer.CommitFailedException:
Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured session.timeout.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records. [com.bonc.framework.server.kafka.consumer.ConsumerLoop]
In other words, the commit cannot be completed because the group has already rebalanced and the partitions have been assigned to another member: the time between subsequent calls to poll() exceeded the configured session.timeout.ms, which usually means the poll loop is spending too much time processing messages. The issue can be resolved either by increasing the session timeout or by using max.poll.records to reduce the maximum number of records returned by a single poll().
The following parameter was then set:
// the maximum number of records returned by a single poll from Kafka;
// these max.poll.records records must be processed within session.timeout.ms
props.put("max.poll.records", "100");
The value of max.poll.records needs to be weighed against session.timeout.ms, i.e. it must actually be possible to process those 100 records within the session.timeout.ms window.
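One way to sanity-check that sizing, sketched here as an assumption rather than anything from the original post, is to time how long each batch takes to process:

long start = System.currentTimeMillis();
ConsumerRecords<String, String> records = consumer.poll(100);
for (ConsumerRecord<String, String> record : records) {
    process(record); // hypothetical per-message handler
}
long elapsed = System.currentTimeMillis() - start;
if (elapsed > 20000) {
    // getting close to the 30000 ms session timeout: lower max.poll.records
    log.warn("processed {} records in {} ms", records.count(), elapsed);
}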
Note:
props.put("session.timeout.ms", "30000");
// request.timeout.ms is the maximum time the client waits for the response to a request
// (such as an offset commit); it needs to be greater than session.timeout.ms
props.put("request.timeout.ms", "40000");
It is also important to pay attention to the fetch.min.bytes parameter, the minimum amount of data the server returns for a fetch request from the consumer. It is best to set this parameter explicitly, otherwise problems may occur. The recommended setting is:
// the minimum amount of data the server sends to the consumer; if less data is
// available, the server waits until the minimum is met. The default of 1 means
// the server responds as soon as any data is available.
props.put("fetch.min.bytes", "1");
To summarize:
In general, repeated consumption in Kafka comes down to offsets not being committed normally, so the problem can be resolved by adjusting the configuration until offsets are committed as expected. The main configurations mentioned above are consolidated below:
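A consolidated sketch of everything discussed above, assuming the 0.9/0.10-era Java client with auto-commit enabled (bootstrap servers, group id, and topic are illustrative placeholders):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerLoopExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "my-group");                // placeholder
        props.put("enable.auto.commit", "true");          // auto-commit offsets
        props.put("auto.commit.interval.ms", "1000");     // auto-commit interval
        props.put("session.timeout.ms", "30000");         // session timeout
        props.put("request.timeout.ms", "40000");         // greater than session.timeout.ms
        props.put("max.poll.records", "100");             // sized to be processable within session.timeout.ms
        props.put("fetch.min.bytes", "1");                // respond as soon as any data is available
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("my-topic")); // placeholder topic
        try {
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(100); // poll(long) of that client era
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        } finally {
            consumer.close();
        }
    }
}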