Savepoints vs externalized checkpoints

Question

1. As per the subject: did I understand correctly (from https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/state/checkpoints.html#difference-to-savepoints) that the only functional difference, apart from the storage format and the fact that savepoints can't be incremental, is that savepoint state supports rescaling (stop, change parallelism, start, right?) whereas checkpointed state doesn't?
2. What else do savepoints support that checkpoints don't (the doc says "features like rescaling")?
3. Doesn't it seem weird to have two such similar yet complex entities? Are there any plans to merge them?
4. Are there plans to support rescaling from checkpointed state (that would probably be required for the autoscaling feature)?
5. Would I lose much if I switched from externalized checkpoints to a custom external service that periodically takes savepoints?

Dawid Wysakowicz · Accepted Answer · 2018-07-13T09:18:08

First of all, just to clarify: the biggest difference is that they use different storage formats. Checkpoints use the state backend's native format (e.g. RocksDB), whereas savepoints use Flink's own format. This differentiation allows a few use cases that would not be possible otherwise. I think that answers the first point.

Ad.2 That said, it is true that you can rescale only with a savepoint, but checkpoints, for example, are required for local recovery (available in 1.5+). Another important difference is that you should be able to switch the state backend with a savepoint, but you cannot do so with checkpoints (as they use native formats).
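
To illustrate the stop, change parallelism, start cycle from the question, here is a rough sketch (not part of the original answer) that builds the two Flink CLI invocations in Python. The `bin/flink` location, the job ID, the savepoint paths and the jar name are all placeholders; the helper names are made up for this example. It relies on `flink cancel -s <targetDirectory> <jobId>` and `flink run -s <savepointPath> -p <parallelism> <jar>` as documented for the CLI.

```python
from typing import List


def cancel_with_savepoint_cmd(job_id: str, target_dir: str,
                              flink_bin: str = "bin/flink") -> List[str]:
    """Build the command that takes a savepoint and then cancels the job."""
    # -s <targetDirectory>: trigger a savepoint before cancelling.
    return [flink_bin, "cancel", "-s", target_dir, job_id]


def run_from_savepoint_cmd(savepoint_path: str, parallelism: int, jar_path: str,
                           flink_bin: str = "bin/flink") -> List[str]:
    """Build the command that resumes from a savepoint with a new parallelism."""
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    # -s <savepointPath>: restore state; -p <n>: the new job parallelism.
    return [flink_bin, "run", "-s", savepoint_path,
            "-p", str(parallelism), jar_path]


if __name__ == "__main__":
    # Placeholder values; print the commands instead of executing them.
    print(" ".join(cancel_with_savepoint_cmd("<jobId>", "hdfs:///savepoints")))
    print(" ".join(run_from_savepoint_cmd("<savepointPath>", 8, "my-job.jar")))
```

The actual savepoint path to pass to the second command is reported by the first one when it completes.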

Ad.3 I think that with the above explanation the answer is rather simple: it is not weird, and I don't think there are plans to merge them.

Ad.4 Automatic rescaling is definitely on the roadmap, but I don't think there is a set schedule yet.

Ad.5 If you disable checkpointing altogether, you lose automatic recovery. If you just switch from externalized to Flink-managed checkpoints, you shouldn't lose much.
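
For reference, the "custom external periodically savepointing service" from the question could look roughly like the sketch below (again, not part of the original answer). It simply calls `flink savepoint <jobId> <targetDirectory>` in a loop. The function names, the `bin/flink` path, and the `max_runs`, `run` and `sleep` parameters are assumptions made for the example, the latter two so the loop can be exercised without a cluster.

```python
import subprocess
import time
from typing import Callable, List, Optional


def savepoint_cmd(job_id: str, target_dir: str,
                  flink_bin: str = "bin/flink") -> List[str]:
    """Build the CLI command that triggers a savepoint for a running job."""
    return [flink_bin, "savepoint", job_id, target_dir]


def savepoint_loop(job_id: str, target_dir: str, interval_seconds: float,
                   max_runs: Optional[int] = None,
                   run: Callable = subprocess.run,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Trigger a savepoint every interval_seconds; return how many were triggered.

    max_runs=None loops until interrupted. check=True makes subprocess.run
    raise if the CLI exits with a non-zero status.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        run(savepoint_cmd(job_id, target_dir), check=True)
        runs += 1
        # Skip the final sleep once the requested number of runs is reached.
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs
```

Note that such a service only produces savepoints; as said above, automatic recovery still depends on checkpointing being enabled in the job itself.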
