3
votes

I have a Kafka producer with idempotence enabled (without exactly-once semantics or transactions enabled) inside a REST endpoint call. I enabled it because I did not want any duplicates caused by Kafka retries. I am concerned about the following:

  • Will having idempotence slow down my endpoint? (This endpoint needs to be really fast)
  • I read the kafka api doc, that enabling idempotence will make the retries infinite (what ?)
  • Do I really need idempotence if I am not using it with transactions?
1 Answer

5
votes

Update for Apache Kafka 3.0: According to the announcement of Apache Kafka 3.0, the producer now enables the strongest delivery guarantees by default (acks=all, enable.idempotence=true). This means that users now get ordering and durability by default.


"Will having idempotence slow down my endpoint? (This endpoint needs to be really fast)"

Kafka allows producing messages idempotently by using an internal sequence number. This number is cached and compared on the broker side, so producing the messages is slightly more time-consuming. In addition, although you can have multiple write requests in flight, if one fails, the few subsequent ones will fail with a retriable OutOfOrderSequenceException, which could also slow down your producer.
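To make the sequence-number comparison more concrete, here is a minimal sketch of the idea in Java. It is a simplified model, not the broker's actual code: the class SequenceCheck, its Result enum, and the "producerId:partition" key format are all hypothetical names chosen for illustration, and it tracks one sequence number per batch.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of per-producer sequence checking; NOT the broker's real code.
final class SequenceCheck {
    enum Result { ACCEPTED, DUPLICATE, OUT_OF_ORDER }

    // Last accepted sequence per "producerId:partition" key (hypothetical key format).
    private final Map<String, Integer> lastSequence = new HashMap<>();

    Result append(long producerId, int partition, int sequence) {
        String key = producerId + ":" + partition;
        int last = lastSequence.getOrDefault(key, -1);
        if (sequence <= last) {
            // A retry of something that was already written: drop it.
            return Result.DUPLICATE;
        }
        if (sequence != last + 1) {
            // A gap: an earlier batch is missing, so this one must be retried later.
            return Result.OUT_OF_ORDER;
        }
        lastSequence.put(key, sequence);
        return Result.ACCEPTED;
    }
}
```

In this model, a retried batch is recognised by its already-seen sequence number and discarded, while a batch that arrives ahead of a failed predecessor is rejected until the gap is filled. That second case corresponds to the retriable out-of-sequence failure described above.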

However, both are minor overheads compared to a producer with idempotence disabled, and I am not aware of any comprehensive benchmark measuring the difference in throughput or latency. The best approach is to test it in your actual environment.

"I read the kafka api doc, that enabling idempotence will make the retries infinite (what ?)"

The description of the configuration enable.idempotence says: "When set to 'true', the producer will ensure that exactly one copy of each message is written in the stream. If 'false', producer retries due to broker failures, etc., may write duplicates of the retried message in the stream. Note that enabling idempotence requires max.in.flight.requests.per.connection to be less than or equal to 5, retries to be greater than 0 and acks must be 'all'. If these values are not explicitly set by the user, suitable values will be chosen. If incompatible values are set, a ConfigException will be thrown."

Remember that the default value for retries is 2147483647 anyway (which I think is what is meant by "infinite"). Feel free to set this value to a lower number, as long as it is still greater than 0.
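As a rough illustration of the constraints quoted from the documentation, the sketch below builds producer settings with a lower retries value and checks them against those three rules. The class IdempotentProducerSettings and both of its methods are hypothetical helpers, the broker address in the usage note is a placeholder, and the check is only a mirror of the quoted rules; the real client does its own validation when the producer is constructed.

```java
import java.util.Properties;

// Hypothetical helper; the property keys are the standard producer config names.
final class IdempotentProducerSettings {

    static Properties build(String bootstrapServers, int retries) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("enable.idempotence", "true");
        props.setProperty("acks", "all");
        // Lower than the default of 2147483647, but must stay above 0.
        props.setProperty("retries", Integer.toString(retries));
        props.setProperty("max.in.flight.requests.per.connection", "5");
        return props;
    }

    // Mirrors the rules quoted above: retries > 0, in-flight <= 5, acks = all.
    // Unset values are treated as compatible, since suitable values are chosen.
    static boolean isCompatibleWithIdempotence(Properties props) {
        int retries = Integer.parseInt(props.getProperty("retries", "2147483647"));
        int inFlight = Integer.parseInt(
                props.getProperty("max.in.flight.requests.per.connection", "5"));
        String acks = props.getProperty("acks", "all");
        // "-1" is the numeric spelling of acks=all.
        return retries > 0 && inFlight <= 5 && (acks.equals("all") || acks.equals("-1"));
    }
}
```

For example, IdempotentProducerSettings.build("localhost:9092", 3) yields settings that pass the check, and the resulting Properties object could then be handed to a KafkaProducer together with the serializer settings your application needs.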

Regarding the ordering guarantees of an idempotent KafkaProducer even with more than one in-flight request, I have written an answer here.

"Do I really need idempotence if I am not using it with transactions?"

I can't tell what your requirements are, but enabling idempotence on the producer side ensures that duplicates are not created due to broker or producer failures.

Remember that transactions in Kafka have two sides: not only the producer but also the consumer. If you use transactions, you should also look into the consumer configuration isolation.level.
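For the consumer side, a minimal sketch of what that could look like is shown here. The class TransactionalConsumerSettings is a hypothetical helper, and the broker address and group id in the usage note are placeholders; only the property keys and the value read_committed are standard consumer settings.

```java
import java.util.Properties;

// Hypothetical helper; the property keys are the standard consumer config names.
final class TransactionalConsumerSettings {

    static Properties build(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // The default is read_uncommitted; read_committed hides messages
        // from transactions that are still open or were aborted.
        props.setProperty("isolation.level", "read_committed");
        return props;
    }
}
```

A call such as TransactionalConsumerSettings.build("localhost:9092", "my-group") gives the base settings; a real KafkaConsumer would additionally need key and value deserializers configured.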