I have an Apache Flink job, implemented with the DataStream API, which contains some initialization code that runs before the job graph is defined and submitted. The initialization code should run only the first time the job is submitted, not when the job resumes from a checkpoint or is updated using a savepoint.

It seems that when the job is restarted from a checkpoint during a failover, it is restarted from a job graph stored in the checkpoint. In particular, the initialization code is not run a second time (which is what I want).

Is the same possible when running a job from a savepoint? In other words, is there a way to execute code only when the job is not started from a savepoint?

1 Answer

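If you implement the CheckpointedFunction interface, then initializeState(FunctionInitializationContext context) will be called when the function is initialized. You can then use context.isRestored() to determine whether the job is being started for the first time. It returns true when state was restored from a snapshot of a previous execution, which covers both checkpoints and savepoints, and false on a fresh start.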
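As a rough illustration of that check, here is a minimal sketch that compiles without Flink on the classpath. InitContext is a simplified local stand-in for Flink's FunctionInitializationContext (only isRestored() is modelled), and OneTimeInit and its log field are hypothetical names used for illustration. In a real job the class would implement CheckpointedFunction and receive the real context from Flink.

```java
import java.util.ArrayList;
import java.util.List;

public class FirstRunGuardSketch {

    /** Simplified stand-in for Flink's FunctionInitializationContext (assumption: only isRestored() is modelled). */
    public interface InitContext {
        boolean isRestored();
    }

    /** Hypothetical function that runs its one-time setup only on a fresh start. */
    public static class OneTimeInit {
        // Records which branch was taken so the behaviour can be inspected.
        public final List<String> log = new ArrayList<>();

        // Mirrors the shape of CheckpointedFunction#initializeState(context).
        public void initializeState(InitContext context) {
            if (context.isRestored()) {
                // State came from a checkpoint or savepoint: skip the one-time setup.
                log.add("restored: skipping one-time initialization");
            } else {
                // No previous snapshot: this is the first start of the job.
                log.add("fresh start: running one-time initialization");
            }
        }
    }

    public static void main(String[] args) {
        OneTimeInit fresh = new OneTimeInit();
        fresh.initializeState(() -> false);   // simulates a first submission

        OneTimeInit restored = new OneTimeInit();
        restored.initializeState(() -> true); // simulates a restore from a snapshot

        System.out.println(fresh.log);
        System.out.println(restored.log);
    }
}
```

Note that in Flink initializeState is invoked on each parallel instance of the function on the task managers, not in the client-side main() method where the job graph is built, so initialization placed there may run once per subtask rather than once per job.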