I want to run an Apache Flink (1.11.1) streaming application on Kubernetes, with a filesystem state backend saving to S3. Checkpointing to S3 is working with the following container args:
```yaml
args:
  - "standalone-job"
  - "-s"
  - "s3://BUCKET_NAME/34619f2862ce3e5fc91d80eae13a434a/chk-4/_metadata"
  - "--job-classname"
  - "com.abc.def.MY_JOB"
  - "--kafka-broker"
  - "KAFKA_HOST:9092"
```
The problems I'm facing are:
- I have to select the previous checkpoint directory manually. Is there a way to improve this, e.g. to resume from the latest checkpoint automatically?
- The job increments the chk directory, but it does not actually restore from the checkpoint. Concretely: I emit an event the first time I see it and record it in a `ListState<String>` (roughly like the sketch at the end of this question), yet whenever I deploy a newer version of my application via GitLab, the job emits that event again.
- Why do I have to enable checkpointing explicitly in my code when I have already set `state.backend` to `filesystem`?
```java
env.enableCheckpointing(Duration.ofSeconds(60).toMillis());
env.getCheckpointConfig().enableExternalizedCheckpoints(
        CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
```
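For reference, here is a minimal sketch of the dedup logic from the second point. The class and state names are made up for illustration; only the `ListState<String>` usage corresponds to what I described above:

```java
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Hypothetical operator: emits an event only the first time it is seen,
// remembering seen events in keyed ListState.
public class FirstSeenFunction extends KeyedProcessFunction<String, String, String> {

    private transient ListState<String> seen;

    @Override
    public void open(Configuration parameters) {
        seen = getRuntimeContext().getListState(
                new ListStateDescriptor<>("seen-events", String.class));
    }

    @Override
    public void processElement(String value, Context ctx, Collector<String> out) throws Exception {
        boolean alreadySeen = false;
        for (String s : seen.get()) {
            if (s.equals(value)) {
                alreadySeen = true;
                break;
            }
        }
        if (!alreadySeen) {
            seen.add(value);
            // If the checkpointed state were restored on redeploy, this
            // would fire at most once per event across deployments.
            out.collect(value);
        }
    }
}
```

With state correctly restored from the checkpoint, the emitted event should not reappear after a redeploy; in my setup it does.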