I have a use case where I need 100% reliability, idempotency (no duplicate messages) as well as order-preservation in my Kafka partitions. I'm trying to set up a proof of concept using the transactional API to achieve this. There is a setting called 'isolation.level' that I'm struggling to understand.
In this article, they talk about the difference between the two options
There are now two new isolation levels in Kafka consumer:
read_committed: Read both kind of messages that are not part of a transaction and that are, after the transaction is committed. Read_committed consumer uses end offset of a partition, instead of client-side buffering. This offset is the first message in the partition belonging to an open transaction. It is also known as “Last Stable Offset” (LSO). A read_committed consumer will only read up till the LSO and filter out any transactional messages which have been aborted.
read_uncommitted: Read all messages in offset order without waiting for transactions to be committed. This option is similar to the current semantics of a Kafka consumer.
The performance implication here is obvious but I'm honestly struggling to read between the lines and understand the functional implications/risk of each choice. It seems like read_committed
is 'safer' but I want to understand why.