2
votes

I work with time-series data for a live application, so old data has no significance. I only want to process records received after the Streams app has started, not everything since the previously committed offset. What is the correct way to ignore old records in a Kafka Streams app after a restart?

With the Kafka consumer API I generally used the seekToEnd() method to skip forward to the latest record. Is there an equivalent mechanism for Kafka Streams? I want to avoid filtering through all messages since the last commit just to ignore the old ones.

1
Hey stanley, did you get any workaround for this? - Amanpreet Khurana

1 Answer

0
votes

You can create a separate consumer with the Kafka Consumer API, using a group.id equal to the application.id of your Kafka Streams app, and call seekToEnd() on it before starting your stream. Disable auto-commit (enable.auto.commit=false) for this special consumer and commit the offsets manually after seekToEnd(). Then start your stream.

Make sure the stream does not start until the offsets from this reset consumer have been committed.
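
Here is a rough sketch of what that reset consumer could look like. It is not a tested production setup: the broker address `localhost:9092`, the application id `my-streams-app`, the topic `input-topic`, and the class and method names are all placeholders I made up for illustration. Note that seekToEnd() is evaluated lazily, so the sketch calls position() on each partition to resolve the end offset before committing. It also assumes no instance of the Streams app is running while the commit happens, since a manual commit against an active group will generally be rejected.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.stream.Collectors;

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class OffsetResetter {

    // Moves the group's committed offsets for every partition of `topic`
    // to the current end of the log and returns what was committed.
    static Map<TopicPartition, OffsetAndMetadata> skipToEnd(Consumer<?, ?> consumer, String topic) {
        List<TopicPartition> partitions = consumer.partitionsFor(topic).stream()
                .map(info -> new TopicPartition(info.topic(), info.partition()))
                .collect(Collectors.toList());

        // Manual assignment: no group rebalance is triggered.
        consumer.assign(partitions);
        consumer.seekToEnd(partitions);

        Map<TopicPartition, OffsetAndMetadata> toCommit = new HashMap<>();
        for (TopicPartition tp : partitions) {
            // position() forces the lazy seekToEnd() to resolve.
            toCommit.put(tp, new OffsetAndMetadata(consumer.position(tp)));
        }

        // Synchronous commit, so the offsets are stored before we return.
        consumer.commitSync(toCommit);
        return toCommit;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder values: adjust to your cluster and application.
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Must match the application.id of the Kafka Streams app.
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-streams-app");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            skipToEnd(consumer, "input-topic");
        }
        // Only after this point should the Streams app be started.
    }
}
```

Because the commit is synchronous and the consumer is closed before anything else happens, starting the stream right after skipToEnd() returns satisfies the ordering requirement. If your topology reads from several input topics, you would call the helper once per topic.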