I have developed a Kafka consumer application using the spring-kafka library, with the default consumer configuration and manual commits.
I am running two instances of the application, listening to two different Kafka topics. During load testing, I observed the error below under higher load, but in only one of the instances:
org.apache.kafka.clients.consumer.CommitFailedException:
Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member.
This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms,
which typically implies that the poll loop is spending too much time message processing.
You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.sendOffsetCommitRequest(ConsumerCoordinator.java:725)
I read several articles and found that if the consumer spends too much time processing messages, and the broker therefore receives no indication that the consumer is still alive, a consumer rebalance happens and the above exception is thrown when committing the uncommitted messages.
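To make that timing constraint concrete, here is a rough sketch of the arithmetic as I understand it. It assumes the Kafka client defaults (max.poll.records = 500, max.poll.interval.ms = 300000); the class name, method names, and the 800 ms per-record cost are hypothetical and only for illustration, not measured values from my application.

```java
public class PollBudget {
    // Kafka consumer defaults (assumed unchanged here):
    // max.poll.records = 500, max.poll.interval.ms = 300000 (5 minutes).
    static final int DEFAULT_MAX_POLL_RECORDS = 500;
    static final long DEFAULT_MAX_POLL_INTERVAL_MS = 300_000L;

    /** Average time available per record before the poll interval is exceeded. */
    static long perRecordBudgetMs(int maxPollRecords, long maxPollIntervalMs) {
        return maxPollIntervalMs / maxPollRecords;
    }

    /** True if processing a full batch at the given per-record cost overruns the interval. */
    static boolean wouldExceedInterval(int maxPollRecords, long perRecordMs, long maxPollIntervalMs) {
        return (long) maxPollRecords * perRecordMs > maxPollIntervalMs;
    }

    public static void main(String[] args) {
        long budget = perRecordBudgetMs(DEFAULT_MAX_POLL_RECORDS, DEFAULT_MAX_POLL_INTERVAL_MS);
        System.out.println("Per-record budget with defaults: " + budget + " ms");

        // Hypothetical load: each record takes 800 ms to process.
        boolean overrun = wouldExceedInterval(DEFAULT_MAX_POLL_RECORDS, 800L, DEFAULT_MAX_POLL_INTERVAL_MS);
        System.out.println("Full batch at 800 ms/record exceeds interval: " + overrun);
    }
}
```

If this reasoning is right, a listener whose per-record processing is slower under load could cross the limit while a faster one on a different topic would not.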
I resolved the error by setting max.poll.interval.ms to Integer.MAX_VALUE. However, I am wondering why I get the error in only one instance, while the other instance works as expected under higher load.
Can anyone please share the correct root cause and an ideal value for max.poll.interval.ms, or an appropriate solution for this issue?
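For reference, this is roughly what the workaround looks like as consumer properties, next to a bounded alternative that I have not tested. It is only a sketch: the class and method names are hypothetical, the property keys are the standard Kafka consumer config names, and the factor of 2 used as headroom is an arbitrary assumption rather than a recommended value.

```java
import java.util.HashMap;
import java.util.Map;

public class ConsumerProps {
    /** The workaround: effectively disables the poll-interval check. */
    static Map<String, Object> workaround() {
        Map<String, Object> props = new HashMap<>();
        props.put("max.poll.interval.ms", Integer.MAX_VALUE);
        return props;
    }

    /**
     * Untested alternative: cap the batch size and derive the interval from an
     * assumed worst-case per-record processing time, with 2x headroom.
     */
    static Map<String, Object> bounded(int maxPollRecords, long worstCasePerRecordMs) {
        Map<String, Object> props = new HashMap<>();
        props.put("max.poll.records", maxPollRecords);
        long intervalMs = 2L * maxPollRecords * worstCasePerRecordMs;
        // max.poll.interval.ms is an int-valued setting, so clamp to Integer.MAX_VALUE.
        props.put("max.poll.interval.ms", (int) Math.min(intervalMs, Integer.MAX_VALUE));
        return props;
    }

    public static void main(String[] args) {
        System.out.println(workaround());
        // Hypothetical numbers: 100 records per poll, 1 second worst case per record.
        System.out.println(bounded(100, 1_000L));
    }
}
```

In a spring-kafka application, a map like this would be merged into the properties given to the consumer factory.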