Versions: Spring Boot 1.5.x, Spring Boot 2.4.x, Apache Kafka 0.10.2
The Situation
We have two service instances hosted on different servers. Each instance initializes multiple Kafka consumers. All consumers listen to the same topic and belong to the same consumer group. We are not relying on Spring Boot/Spring Kafka auto-configuration; we configure the ConcurrentKafkaListenerContainerFactory and its DefaultKafkaConsumerFactory ourselves. All consumer configuration properties are left at the Apache Kafka consumer defaults except for max.poll.records, session.timeout.ms, and heartbeat.interval.ms. The acknowledgment mode is set to RECORD.
We are using the @KafkaListener annotation, setting its containerFactory property to the bean name of the initialized ConcurrentKafkaListenerContainerFactory and setting its topics property.
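To make the setup above concrete, here is a rough sketch of the kind of property map we hand to the DefaultKafkaConsumerFactory. The class name ConsumerOverrides, the broker address, the group id, and all numeric values are placeholders for illustration, not our real configuration; only the three overridden property names come from our actual setup.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: builds the consumer properties passed to the consumer factory. */
final class ConsumerOverrides {

    static Map<String, Object> consumerProps() {
        Map<String, Object> props = new HashMap<>();
        // Connection and group settings (placeholder values).
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "example-group");
        // The only three properties changed from the Kafka defaults
        // (placeholder values, not the ones we actually use).
        props.put("max.poll.records", 100);
        props.put("session.timeout.ms", 30000);
        props.put("heartbeat.interval.ms", 10000);
        // Key/value deserializers are omitted here; they would also have to be
        // supplied, either in this map or to the factory directly.
        return props;
    }
}
```

Everything not listed in the map, including max.poll.interval.ms, stays at the Kafka client default.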
The Problem
When no messages are published to the topic for a day or two, all consumers are removed from the consumer group. I can’t find any reason for this to happen. From my reading of both the Apache Kafka and Spring Kafka documentation, a consumer is considered alive as long as poll is called at least once every max.poll.interval.ms and heartbeats reach the broker within session.timeout.ms. According to the documentation, the listener container calls poll continuously, even when no records are returned, and heartbeats are sent at the interval set by heartbeat.interval.ms. So an idle topic should not, by itself, cause a consumer to be dropped.
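To spell out my understanding of the two liveness rules, here is a toy model in plain Java. It is not how Kafka or Spring Kafka implement this; the class ConsumerLiveness, its methods, and the one-second simulation step are all made up for illustration. It only encodes the two conditions described above.

```java
/** Toy model of the two liveness checks; not Kafka's actual implementation. */
final class ConsumerLiveness {
    private final long maxPollIntervalMs;
    private final long sessionTimeoutMs;
    private long lastPollMs;
    private long lastHeartbeatMs;

    ConsumerLiveness(long maxPollIntervalMs, long sessionTimeoutMs, long nowMs) {
        this.maxPollIntervalMs = maxPollIntervalMs;
        this.sessionTimeoutMs = sessionTimeoutMs;
        this.lastPollMs = nowMs;
        this.lastHeartbeatMs = nowMs;
    }

    void recordPoll(long nowMs) { lastPollMs = nowMs; }

    void recordHeartbeat(long nowMs) { lastHeartbeatMs = nowMs; }

    /** Alive only if both the poll deadline and the heartbeat deadline are met. */
    boolean isAlive(long nowMs) {
        boolean pollOk = nowMs - lastPollMs <= maxPollIntervalMs;
        boolean heartbeatOk = nowMs - lastHeartbeatMs <= sessionTimeoutMs;
        return pollOk && heartbeatOk;
    }

    /**
     * Simulates an idle topic in one-second steps: polls return nothing, but
     * polling and heartbeating continue at the given periods.
     */
    static boolean staysAliveWhileIdle(ConsumerLiveness c, long durationMs,
                                       long pollEveryMs, long heartbeatEveryMs) {
        for (long now = 0; now <= durationMs; now += 1000) {
            if (now % pollEveryMs == 0) c.recordPoll(now);
            if (now % heartbeatEveryMs == 0) c.recordHeartbeat(now);
            if (!c.isAlive(now)) return false;
        }
        return true;
    }
}
```

Under this model, a consumer that keeps polling and heartbeating on schedule stays alive for two idle days, which is why the removal from the group is unexpected to me.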
The Questions
- Does Spring Boot/Spring Kafka set any property that causes a consumer that hasn’t received any records from an idle topic for a day or two to be removed from the consumer group?
- If yes, can this be turned off, and what are the downsides?
- If no, is there a way to rejoin the consumer group without restarting the service, and what are the downsides?