
At the company I work for, we use Spring for Kafka without authentication. Recently we experimented with setting up security in Kafka and briefly enabled authentication, which crashed all the consumers/producers in our microservices (the microservices themselves stayed up).

The exception:

Authorization Exception and no authorizationExceptionRetryInterval set

org.apache.kafka.common.errors.GroupAuthorizationException: Not authorized to access group: foo-group

After some research we found out that this is the expected behavior of the Kafka clients, and that we needed to set the authorizationExceptionRetryInterval property:

public void setAuthorizationExceptionRetryInterval(java.time.Duration authorizationExceptionRetryInterval)

Set the interval between retries after AuthorizationException is thrown by KafkaConsumer. By default the field is null and retries are disabled. In such case the container will be stopped. The interval must be less than max.poll.interval.ms consumer property.
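The constraint in the Javadoc above (the interval must be less than max.poll.interval.ms) can be checked before the value is handed to the container. The following is only a minimal sketch that runs without Spring: the class RetryIntervalCheck and the method validated are hypothetical names, and the consumer properties are assumed to be available as a plain map. The fallback of 300000 ms is Kafka's documented default for max.poll.interval.ms.

```java
import java.time.Duration;
import java.util.Map;

final class RetryIntervalCheck {

    // Kafka's documented default for max.poll.interval.ms (5 minutes).
    static final long DEFAULT_MAX_POLL_INTERVAL_MS = 300_000L;

    /**
     * Returns the interval unchanged if it is positive and strictly shorter
     * than max.poll.interval.ms; throws IllegalArgumentException otherwise.
     */
    static Duration validated(Duration retryInterval, Map<String, ?> consumerProps) {
        Object raw = consumerProps.get("max.poll.interval.ms");
        long maxPollMs = (raw == null)
                ? DEFAULT_MAX_POLL_INTERVAL_MS
                : Long.parseLong(raw.toString());
        long retryMs = retryInterval.toMillis();
        if (retryMs <= 0 || retryMs >= maxPollMs) {
            throw new IllegalArgumentException(
                    "retry interval " + retryMs + " ms must be > 0 and < " + maxPollMs + " ms");
        }
        return retryInterval;
    }

    public static void main(String[] args) {
        Duration interval = validated(
                Duration.ofSeconds(10), Map.of("max.poll.interval.ms", 300000));
        System.out.println("Retry interval accepted: " + interval);
        // In a Spring Kafka application the value would then be passed on, e.g.
        // factory.getContainerProperties().setAuthorizationExceptionRetryInterval(interval);
    }
}
```

The 10-second value is just an illustration; a suitable interval depends on how quickly the missing authorization is expected to be granted.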

Here are some other useful links:

Setting authorizationExceptionRetryInterval for Spring Kafka

Why does the spring KafkaConsumer suspend all consumption from n topics when one fails to authorize

What I want to know is:

  1. Is a failed authentication the only case in which consumers/producers go down?
  2. If there are other cases, how can we make sure that our consumers/producers recover without human intervention (restarting the microservices)? In other words, how can we check whether the consumers/producers are up and restart them otherwise?

1 Answer


Containers are stopped only under the following circumstances:

  • AuthorizationException with no authorizationExceptionRetryInterval
  • NoOffsetForPartitionException - thrown when ConsumerConfig.AUTO_OFFSET_RESET_CONFIG is not earliest or latest and there is no existing offset for a partition with this consumer group.
  • FencedInstanceIdException - using transactions and static group members (meaning some other instance is using this instance id).
  • StopAfterFenceException - when stopContainerWhenFenced is true (default false) - only applies with transactions
  • Any Error (such as OOME)
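
For the second part of the question (detecting stopped containers and restarting them), one possible approach is a periodic check that starts any container that is no longer running. The sketch below is an assumption-laden illustration, not a definitive implementation: Restartable is a hypothetical stand-in for the isRunning()/start() methods that Spring's listener containers expose through the Lifecycle interface, and FakeContainer exists only so the example runs without Spring. In a real application the containers could presumably be obtained from KafkaListenerEndpointRegistry.getListenerContainers().

```java
import java.util.List;

final class ContainerWatchdog {

    /** Hypothetical stand-in for the isRunning()/start() pair of Spring's Lifecycle. */
    interface Restartable {
        boolean isRunning();
        void start();
    }

    /** Starts every container that is not running and returns how many were started. */
    static int restartStopped(List<? extends Restartable> containers) {
        int restarted = 0;
        for (Restartable container : containers) {
            if (!container.isRunning()) {
                container.start();
                restarted++;
            }
        }
        return restarted;
    }

    /** Fake container used only to demonstrate the check. */
    static final class FakeContainer implements Restartable {
        private boolean running;

        FakeContainer(boolean running) {
            this.running = running;
        }

        @Override
        public boolean isRunning() {
            return running;
        }

        @Override
        public void start() {
            running = true;
        }
    }

    public static void main(String[] args) {
        List<FakeContainer> containers =
                List.of(new FakeContainer(true), new FakeContainer(false));
        System.out.println("Containers restarted: " + restartStopped(containers));
    }
}
```

Such a check could be run on a timer, for example from a method annotated with Spring's @Scheduled. Note that a blind restart only helps if the underlying cause has gone away: a container stopped by an AuthorizationException will most likely stop again until the authorization is fixed, and a container stopped by an Error such as OOME may not be recoverable this way at all.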