
There's an excellent series of articles regarding Kafka transactions and exactly once delivery.
In one of them the author says about consumers:

So on the Consumer side, you have two options for reading transactional messages, expressed through the “isolation.level” consumer config:

read_committed: In addition to reading messages that are not part of a transaction, also be able to read ones that are, after the transaction is committed.

read_uncommitted: Read all messages in offset order without waiting for transactions to be committed. This option is similar to the current semantics of a Kafka consumer.

That is to say, normal consumer applications must specify read_committed if they only want to read committed writes from the topic.
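As a minimal sketch of what that looks like for a plain consumer, the settings could be assembled as below. The broker address and group id are placeholders I made up, and I use plain string keys with java.util.Properties so the snippet compiles without the Kafka client library on the classpath; in a real application the resulting Properties would be passed to a KafkaConsumer.

```java
import java.util.Properties;

public class ReadCommittedConsumerConfig {

    // Builds consumer settings that restrict reads to committed transactional data.
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("group.id", "example-group");           // hypothetical group id
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // The setting discussed above: skip messages from open or aborted transactions.
        props.put("isolation.level", "read_committed");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(build().getProperty("isolation.level"));
    }
}
```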

However, regarding Kafka Streams:

All you need to make your Streams application employ exactly once semantics, is to set this config “processing.guarantee=exactly_once”. This causes all of the processing to happen exactly once; this includes making both the processing and also all of the materialized state created by the processing job that is written back to Kafka, exactly once.
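For reference, a rough sketch of the Streams-side setting from that quote might look like this. The application id and broker address are placeholders of mine, and the keys are written as plain strings so the snippet runs without the Kafka Streams dependency; in practice the Properties would be handed to a KafkaStreams instance.

```java
import java.util.Properties;

public class ExactlyOnceStreamsConfig {

    // Builds Streams settings with the exactly-once processing guarantee enabled.
    public static Properties build() {
        Properties props = new Properties();
        props.put("application.id", "example-streams-app"); // hypothetical application id
        props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
        // The single setting named in the quoted article.
        props.put("processing.guarantee", "exactly_once");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(build().getProperty("processing.guarantee"));
    }
}
```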

Nothing is explicitly said about the reads in the KStream. When exactly_once is configured, does the KStream only read committed messages?


1 Answer


Yes, a KStream will only read committed messages. This is not clearly stated in the documentation, but the StreamsConfig JavaDoc says:

If "processing.guarantee" is set to "exactly_once", Kafka Streams does not allow users to overwrite the following properties (Streams setting shown in parentheses):

"isolation.level" (read_committed) - Consumers will always read committed data only

"enable.idempotence" (true) - Producer will always have idempotency enabled

"max.in.flight.requests.per.connection" (5) - Producer will always have one in-flight request per connection
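To illustrate what "does not allow users to overwrite" means in practice, here is a toy model of the idea. It is not Kafka Streams' actual implementation: the class and method names are hypothetical, and it only covers the isolation.level and enable.idempotence values from the JavaDoc excerpt above. It shows a user-supplied read_uncommitted being replaced once exactly_once is set.

```java
import java.util.HashMap;
import java.util.Map;

public class LockedOverridesSketch {

    // Locked values, taken from the JavaDoc excerpt (toy model, two of the three properties).
    static Map<String, String> lockedUnderExactlyOnce() {
        Map<String, String> locked = new HashMap<>();
        locked.put("isolation.level", "read_committed");
        locked.put("enable.idempotence", "true");
        return locked;
    }

    // Starts from the user's settings; locked values win when exactly_once is requested.
    public static Map<String, String> effectiveConfig(Map<String, String> user) {
        Map<String, String> merged = new HashMap<>(user);
        if ("exactly_once".equals(user.get("processing.guarantee"))) {
            merged.putAll(lockedUnderExactlyOnce());
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> user = new HashMap<>();
        user.put("processing.guarantee", "exactly_once");
        user.put("isolation.level", "read_uncommitted"); // attempted override
        System.out.println(effectiveConfig(user).get("isolation.level")); // prints read_committed
    }
}
```

Without processing.guarantee set to exactly_once, the same method leaves the user's isolation.level untouched, which matches the plain-consumer behaviour described in the question.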