I have a use case where I need to read data from a topic, batch it (100 records), and write each batch to a specific file or an external store. I am planning to use the Processor API for this: buffer the records in process() using a state store backed by Kafka, write the batch out once it reaches 100 records, and then clear the state store to start a fresh batch.
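For context, here is a rough sketch of the processor I have in mind (using the newer Processor API; BatchProcessor, the batch-store store name, and writeBatch() are placeholder names I made up, and approximateNumEntries() is only an estimate for RocksDB-backed stores):

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.Record;
import org.apache.kafka.streams.state.KeyValueIterator;
import org.apache.kafka.streams.state.KeyValueStore;

public class BatchProcessor implements Processor<String, String, Void, Void> {

    private static final int BATCH_SIZE = 100;

    private ProcessorContext<Void, Void> context;
    private KeyValueStore<String, String> store;

    @Override
    public void init(ProcessorContext<Void, Void> context) {
        this.context = context;
        // "batch-store" must be added to the topology and connected to this processor
        this.store = context.getStateStore("batch-store");
    }

    @Override
    public void process(Record<String, String> record) {
        // Key the buffered entry by topic-partition-offset so a reprocessed
        // input record overwrites itself instead of creating a duplicate entry
        context.recordMetadata().ifPresent(meta -> {
            String storeKey = meta.topic() + "-" + meta.partition() + "-" + meta.offset();
            store.put(storeKey, record.value());
        });
        if (store.approximateNumEntries() >= BATCH_SIZE) {
            flush();
        }
    }

    private void flush() {
        List<KeyValue<String, String>> batch = new ArrayList<>();
        try (KeyValueIterator<String, String> it = store.all()) {
            it.forEachRemaining(batch::add);
        }
        writeBatch(batch);
        // clear the store so the next batch starts fresh
        batch.forEach(kv -> store.delete(kv.key));
    }

    private void writeBatch(List<KeyValue<String, String>> batch) {
        // placeholder: append to a file or call the external store here;
        // this side effect is outside Kafka's transaction and can replay on failure
    }
}
```

My worry is exactly the flush() step: as far as I understand, the file/external write happens outside the Kafka transaction, so if the task crashes between that write and the offset commit, the same records could be replayed and flushed again in a second batch.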
One more requirement is that we cannot have duplicates in the data; the same record must not appear in two different batches.
Does Kafka Streams exactly-once fit this use case? I read in the design docs that it is not recommended when batching data, and most articles on the subject say exactly-once only applies to the consume-process-produce pattern.