If I seek back to the start of my topic, I may have millions of messages. I want to process these in batches rather than all at once, committing the offset after each batch. How can I do this, given that poll seems to fetch everything after the current offset, and commit commits the offset at the end of what poll returned?
1 Answer
You can put an upper bound on the amount of data returned from each partition using max.partition.fetch.bytes.
The only downside is that this setting also caps how large a single record can be and still be fetched, so if you don't know how big your records can get, this may not be the best solution.
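As a sketch, bounding the per-partition fetch could look like this. The connection settings and the 1 MB limit are illustrative assumptions, not values from the question:

```java
import java.util.Properties;

public class FetchConfig {
    public static Properties consumerProps() {
        Properties props = new Properties();
        // Placeholder connection settings; adjust for your cluster.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "batch-consumer");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // Cap how much data one fetch returns per partition (1 MB here).
        // Note the downside from the answer: this must be at least as large
        // as your biggest record, or that record may never be fetched.
        props.put("max.partition.fetch.bytes", "1048576");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps().getProperty("max.partition.fetch.bytes"));
    }
}
```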
Each record returned from Kafka carries its topic, partition, and offset within that partition, so once you have processed the entire batch (or after each message, if you want to avoid reprocessing messages when your consumer goes down) you can commit the offset either synchronously or asynchronously.
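Concretely, committing after a batch means committing, for each partition, the offset of the last processed record plus one (the next offset the consumer should read). Here is a broker-free sketch of that bookkeeping; the Rec type is a placeholder standing in for Kafka's ConsumerRecord, and the resulting map is what you would feed into a sync or async commit:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchCommitSketch {
    // Minimal stand-in for ConsumerRecord: just partition and offset.
    record Rec(int partition, long offset) {}

    // After processing a batch, the offset to commit per partition is
    // the highest processed offset + 1 (the next offset to read).
    static Map<Integer, Long> offsetsToCommit(List<Rec> batch) {
        Map<Integer, Long> out = new HashMap<>();
        for (Rec r : batch) {
            out.merge(r.partition(), r.offset() + 1, Math::max);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rec> batch = List.of(new Rec(0, 41), new Rec(0, 42), new Rec(1, 7));
        Map<Integer, Long> toCommit = offsetsToCommit(batch);
        System.out.println(toCommit.get(0)); // 43
        System.out.println(toCommit.get(1)); // 8
    }
}
```

In the real consumer you would build this map with TopicPartition keys and pass it to commitSync or commitAsync after each processed batch.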
KafkaConsumer – shmish111