At the company I work for, we use Spring for Kafka without authentication. Lately we ran some experiments to set up security in Kafka, and we enabled authentication for a brief moment, which caused all the consumers/producers in our microservices to crash! (The microservices themselves stayed up.)
The exception:
Authorization Exception and no authorizationExceptionRetryInterval set
org.apache.kafka.common.errors.GroupAuthorizationException: Not authorized to access group: foo-group
After some research, we found out that this is the expected behavior of the Kafka clients and that we needed to set the authorizationExceptionRetryInterval property:
Set the interval between retries after AuthorizationException is thrown by KafkaConsumer. By default the field is null and retries are disabled. In such case the container will be stopped. The interval must be less than max.poll.interval.ms consumer property.
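For reference, here is a minimal sketch of how I understand this property could be set in Java config. It assumes a Spring Boot style setup in which a `ConsumerFactory<Object, Object>` bean already exists. The class name `KafkaRetryConfig` and the 10-second interval are only illustrative. The setter name matches the property quoted above; I believe newer Spring Kafka versions rename it (to `setAuthExceptionRetryInterval`, as far as I can tell), so check it against the version in use.

```java
import java.time.Duration;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;

// Hypothetical configuration class; the name is illustrative only.
@Configuration
public class KafkaRetryConfig {

    // Assumption: a ConsumerFactory bean is already defined elsewhere
    // (for example by Spring Boot auto-configuration).
    @Bean
    public ConcurrentKafkaListenerContainerFactory<Object, Object> kafkaListenerContainerFactory(
            ConsumerFactory<Object, Object> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory<Object, Object> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);

        // Retry after an AuthorizationException instead of stopping the container.
        // The value is illustrative; it must stay below max.poll.interval.ms
        // (300000 ms, i.e. 5 minutes, by default).
        factory.getContainerProperties()
               .setAuthorizationExceptionRetryInterval(Duration.ofSeconds(10));
        return factory;
    }
}
```

With this in place, the listener container should keep retrying at the given interval rather than stopping on the first authorization failure.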
Here are some other useful links:
Setting authorizationExceptionRetryInterval for Spring Kafka
Why does the spring KafkaConsumer suspend all consumption from n topics when one fails to authorize
What I want to know is:
- Is a failed authentication the only case in which consumers/producers go down?
- If there are other cases, how can we make sure that our consumers/producers recover without human intervention (restarting the microservices)? In other words, how can we check whether the consumers/producers are up and restart them otherwise?