This question is essentially the same as the one asked here: Apache Flink fault tolerance. That is, what happens if a job restarts between two checkpoints? Does it reprocess the records that were already processed after the last checkpoint?
For example, suppose I have two jobs, job1 and job2. Job1 consumes records from Kafka, processes them, and produces them to a second Kafka topic. Job2 consumes from this second topic and processes the records (in my case, by updating values in Aerospike using AerospikeClient).
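For reference, here is a minimal sketch of what job1 looks like. The topic names, broker address, group id, and the processing step are all placeholders, and I am assuming plain string records:

```java
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011;

public class Job1 {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(10_000); // take a checkpoint every 10 seconds

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "job1");

        DataStream<String> records = env.addSource(
                new FlinkKafkaConsumer011<>("topic1", new SimpleStringSchema(), props));

        records
                .map(String::toUpperCase) // placeholder for the real processing
                .addSink(new FlinkKafkaProducer011<>("topic2", new SimpleStringSchema(), props));

        env.execute("job1");
    }
}
```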
Now, from the answer to this question (Apache Flink fault tolerance), I can more or less believe that if job1 restarts, it will not produce duplicate records to the sink: I am using FlinkKafkaProducer011, which extends TwoPhaseCommitSinkFunction (https://flink.apache.org/features/2018/03/01/end-to-end-exactly-once-apache-flink.html). Please explain how this prevents reprocessing (i.e. duplicate production of records to Kafka).
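For completeness, here is roughly how I would construct the producer for this, together with my current (possibly incorrect) understanding of the mechanism in the comments. Only the Semantic.EXACTLY_ONCE variant goes through the two-phase commit path, and the broker address and timeout values are placeholders:

```java
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011;
import org.apache.flink.streaming.util.serialization.KeyedSerializationSchemaWrapper;

public class ExactlyOnceProducer {
    static FlinkKafkaProducer011<String> create() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        // The Kafka transaction has to outlive the checkpoint interval, otherwise
        // Kafka aborts it; the broker-side cap (transaction.max.timeout.ms) may
        // need raising as well.
        props.setProperty("transaction.timeout.ms", "900000");

        // Records produced between two checkpoints go into one Kafka transaction,
        // which is pre-committed when the checkpoint barrier arrives and only
        // committed once the whole checkpoint completes. After a restart, any
        // uncommitted transaction is aborted, so consumers reading with
        // isolation.level=read_committed should never see duplicates.
        return new FlinkKafkaProducer011<>(
                "topic2",
                new KeyedSerializationSchemaWrapper<>(new SimpleStringSchema()),
                props,
                FlinkKafkaProducer011.Semantic.EXACTLY_ONCE);
    }
}
```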
According to the Flink documentation, Flink restarts a job from the last successful checkpoint. So if job2 restarts before completing a checkpoint, it will restart from the last checkpoint, and the records that were already processed after that checkpoint will be reprocessed (i.e. updated multiple times in Aerospike). Am I right, or is there something else in Flink (and/or Aerospike) that prevents this reprocessing in job2?
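To make the job2 concern concrete, here is a simplified sketch of the kind of sink I mean (host, namespace, set, bin names, and the record format are all placeholders). As far as I can tell, whether a replay after a restart is harmful depends on whether the write itself is idempotent:

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Bin;
import com.aerospike.client.Key;

public class AerospikeSink extends RichSinkFunction<String> {
    private transient AerospikeClient client;

    @Override
    public void open(Configuration parameters) {
        client = new AerospikeClient("aerospike-host", 3000);
    }

    @Override
    public void invoke(String record) {
        // Assume "userId|value" records; the parsing is illustrative only.
        String[] parts = record.split("\\|");
        Key key = new Key("test", "users", parts[0]);

        // A plain put is an upsert: replaying the same record after a restart
        // rewrites the same value, so the final state is unchanged (idempotent).
        client.put(null, key, new Bin("value", parts[1]));

        // By contrast, a counter-style update such as
        //   client.operate(null, key, Operation.add(new Bin("count", 1)));
        // would be applied again on replay and give a wrong count.
    }

    @Override
    public void close() {
        if (client != null) {
            client.close();
        }
    }
}
```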