0
votes

I'd like Flink to load the latest checkpoint when my job restarts, but it doesn't. I've written a word count application that is meant to resume counting where it left off after a restart. I am running it from my IDE, so I'm not starting a Flink cluster.

My code is at https://github.com/edu05/wordcount/tree/simple and is inspired by the checkpointing example provided by the Flink creators: https://github.com/streaming-with-flink/examples-scala

What am I missing? How can I also avoid re-printing some of the word counts? I don't see many Apache Flink contributors on Stack Overflow; is there a more appropriate forum?

1
Are you getting an error? Can you show us how you tried to run from a checkpoint? - Ricardo Alvaro Lohmann
I can't see any error but when I start up the app the word counts start from zero. I am not taking any special actions to run from a checkpoint, I thought this happened automatically. I'm just running the main() method from my IDE, waiting to see the word counts going up, waiting for a couple of checkpoints to be produced (I only ascertained this from the logs); I then stop the app from the IDE and then I rerun the same main() method again from my IDE. My expectation was for the app to resume with the previous word counts. Am I doing something wrong? - edu

1 Answer

2
votes

By default, checkpoints are not retained and are only used to resume a job after a failure.
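
To keep checkpoints around after the job is cancelled, retention has to be switched on explicitly. As a rough sketch, assuming a Flink 1.x cluster configured through flink-conf.yaml, the settings would look something like the fragment below. The directory and interval are made-up values for illustration, and the key names have changed between Flink releases, so check them against the configuration reference for your version. A job started from an IDE may not read this file at all, in which case the equivalent settings would need to be applied in the job's own code.

```yaml
# Where checkpoint data and metadata are written (hypothetical path).
state.checkpoints.dir: file:///tmp/flink-checkpoints
# How often a checkpoint is triggered (assumed value).
execution.checkpointing.interval: 10s
# Keep the latest checkpoint when the job is cancelled instead of deleting it.
execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION
```

With retention enabled, each checkpoint directory under the configured path should contain a metadata file, and that directory is what you would pass as the checkpoint path when restoring.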

If you need to start your job from a retained checkpoint, you have to do it manually, just as you would from a savepoint:

$ bin/flink run -s :checkpointMetaDataPath [:runArgs]