0
votes

I am new to Flink and haven't had a chance to read the Flink source code to understand the JobManager, TaskManager, and Task Slot.

I had assumed that the TaskManager process is like Spark's Executor process, and that the JobManager is like Spark's Driver process.

But then I looked at this diagram: https://learning.oreilly.com/library/view/stream-processing-with/9781491974285/assets/components.png

It looks like my assumption is wrong. Is it the Task Slot that runs as a process in the YARN container? In other words, is the Task Slot the counterpart of Spark's Executor process?

I don't have a clear understanding of the JobManager, TaskManager, and Task Slot.


1 Answer

4
votes

The Flink documentation explains how the distributed runtime is organized. To summarize roughly, in comparison to Spark:

  • Task Manager: Spark Worker
  • Task Slot: Spark Task
  • Application: Spark Driver Program
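To make the relationship between Task Managers and Task Slots more concrete, here is a rough sketch in Python. It is only a toy model, not Flink's API: the `TaskManager` class and the `total_slots` and `can_schedule` helpers are hypothetical names invented for illustration. It assumes that each Task Manager is a single process (for example, one YARN container) offering a fixed number of slots, as set by the `taskmanager.numberOfTaskSlots` configuration option, and that with Flink's default slot sharing a job needs as many slots as its highest operator parallelism.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskManager:
    """Hypothetical stand-in for one TaskManager process (e.g. one YARN container)."""
    name: str
    num_slots: int  # plays the role of taskmanager.numberOfTaskSlots


def total_slots(task_managers: List[TaskManager]) -> int:
    """Slots are shares of a TaskManager's resources, so cluster capacity is their sum."""
    return sum(tm.num_slots for tm in task_managers)


def can_schedule(task_managers: List[TaskManager], job_parallelism: int) -> bool:
    """Assumes default slot sharing: the job needs one slot per parallel pipeline."""
    return job_parallelism <= total_slots(task_managers)


# Two TaskManager processes with four slots each.
cluster = [TaskManager("tm-1", 4), TaskManager("tm-2", 4)]

print(total_slots(cluster))       # 8
print(can_schedule(cluster, 6))   # True
print(can_schedule(cluster, 10))  # False
```

In this model the process that runs in a container is the Task Manager; a Task Slot is a share of that process's resources rather than a process of its own.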

As for Flink's Job Manager, until fairly recently (Flink 1.6) it was a monolith playing many roles. Since being refactored, it remains responsible for most cluster-wide concerns that are independent of the cluster framework, such as coordinating checkpoints and recovery, and scheduling.

See also the answers to this question.