3 votes

I have a requirement to read messages from a topic, batch them, and push the batch to an external system. If the batch fails for any reason, I need to consume the same set of messages again and repeat the process, so for every batch the from and to offsets of each partition are stored in a database.

To achieve this, I create one Kafka consumer per partition by assigning the partition to the reader; based on the previously stored offsets, the consumer seeks to that position and starts reading. I have turned off auto commit, and I don't commit offsets from the consumer. For every batch, I create a new consumer per partition, read messages starting from the last stored offset, and publish them to the external system.

Do you see any problems with consuming messages without committing offsets and reusing the same consumer group across batches, given that at any point there won't be more than one consumer per partition?
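Here is a simplified sketch of one per-partition reader; fetchStoredOffset, publishBatch, and storeOffsets are placeholders for my actual database and publishing code:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class PartitionBatchReader {

    void processBatch(String topic, int partition) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "batch-reader");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // auto commit off; we never commit manually either
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        TopicPartition tp = new TopicPartition(topic, partition);
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.assign(Collections.singletonList(tp)); // manual assignment, one consumer per partition
            long from = fetchStoredOffset(tp);              // "to" offset of the previous successful batch
            consumer.seek(tp, from);

            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            if (!records.isEmpty()) {
                publishBatch(records);                                 // push to the external system; may throw
                storeOffsets(tp, from, consumer.position(tp));         // persist offsets only after success
            }
        }
        // If publishBatch throws, nothing is stored, so the next consumer
        // seeks to the same "from" offset and re-reads the same batch.
    }

    // Placeholders for the database and external-system code described above.
    long fetchStoredOffset(TopicPartition tp) { return 0L; }
    void publishBatch(ConsumerRecords<String, String> records) { }
    void storeOffsets(TopicPartition tp, long from, long to) { }
}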


2 Answers

4 votes

Your design seems reasonable to me.

Committing offsets to Kafka is just a convenient built-in mechanism for keeping track of offsets. There is no requirement whatsoever to use it; you can track offsets with any other mechanism, such as a database, as in your case.

Furthermore, if you assign partitions manually, there is no group management at all, so the group.id parameter has no effect. See http://docs.confluent.io/current/clients/consumer.html for more details.
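As an illustration, here is a minimal sketch of keeping offsets entirely in your own store; the offsets table schema (topic, kafka_partition, next_offset) is just an assumption for the example, not something your question specifies:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import org.apache.kafka.common.TopicPartition;

// Offsets live in your own database; Kafka's internal __consumer_offsets
// topic is never involved because nothing is ever committed.
public class DbOffsetStore {

    private final Connection conn;

    public DbOffsetStore(Connection conn) {
        this.conn = conn;
    }

    // Where the next batch should start reading; defaults to 0 for a new partition.
    public long nextOffset(TopicPartition tp) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT next_offset FROM offsets WHERE topic = ? AND kafka_partition = ?")) {
            ps.setString(1, tp.topic());
            ps.setInt(2, tp.partition());
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getLong(1) : 0L;
            }
        }
    }

    // Record the position reached by a successfully published batch.
    public void save(TopicPartition tp, long nextOffset) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "UPDATE offsets SET next_offset = ? WHERE topic = ? AND kafka_partition = ?")) {
            ps.setLong(1, nextOffset);
            ps.setString(2, tp.topic());
            ps.setInt(3, tp.partition());
            ps.executeUpdate();
        }
    }
}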

1 vote

With Kafka version 2 I achieved this behaviour without needing a database to store the offsets. The following configuration is for Spring Boot with spring-kafka, but the approach should work with any Kafka consumer API.

spring:
  kafka:
    bootstrap-servers: ...
    consumer:
      value-deserializer: ...
      max-poll-records: 1000
      enable-auto-commit: false
      fetch-min-size: 262144 # 256 KiB (1/4 MB)
      group-id: ...
      fetch-max-wait: 10000 # a poll returns every 10 s, or sooner once 256 KiB or 1000 records have accumulated
      auto-offset-reset: earliest
    listener:
      type: batch
      concurrency: 7
      ack-mode: manual

This gives me messages in batches of at most 1000 records (depending on load). I then write these records asynchronously to a database and count how many success callbacks I get. If the number of successful writes equals the received batch size, I acknowledge the batch, i.e. I commit the offsets. This design proved very reliable even in a high-load production environment.
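For completeness, here is a sketch of the matching listener; the topic name and writeAsync are placeholders, and CompletableFuture.allOf stands in for my counting of success callbacks:

import java.util.List;
import java.util.concurrent.CompletableFuture;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.support.Acknowledgment;
import org.springframework.stereotype.Component;

@Component
public class BatchListener {

    // type: batch and ack-mode: manual (see the YAML above) deliver the whole
    // batch plus an Acknowledgment for committing offsets manually.
    @KafkaListener(topics = "my-topic")
    public void onBatch(List<String> records, Acknowledgment ack) {
        // Kick off one asynchronous database write per record.
        CompletableFuture<?>[] writes = records.stream()
                .map(this::writeAsync)
                .toArray(CompletableFuture[]::new);

        // join() throws if any write failed; in that case the batch is not
        // acknowledged, so its offsets are never committed and the records
        // can be consumed again.
        CompletableFuture.allOf(writes).join();
        ack.acknowledge(); // all writes succeeded: commit the batch's offsets
    }

    // Placeholder for the real asynchronous database write.
    private CompletableFuture<Void> writeAsync(String record) {
        return CompletableFuture.completedFuture(null);
    }
}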