I've read many docs about this issue but I'm still confused by the two concepts: slot and task.
Take the WordCount example.
As I understand it, each yellow circle is an operator, and Flink can optimize the job by merging several operators into one operator chain. In this example, Source and map() can be merged, so the graph becomes the following:
The whole stream then consists of three tasks: Source + map(), keyBy()/window()/apply(), and Sink.
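To make my mental model of chaining concrete, here is a toy sketch in plain Java. It does not use Flink's API: the class ChainingSketch, the Op type, and the chainableToPrevious flag are all hypothetical names I made up, and which operators count as chainable is simply taken from the picture above, not from Flink's real chaining rules.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: this is NOT Flink's API, just the chaining idea described above.
final class ChainingSketch {

    // Hypothetical operator description: a name plus a flag saying whether
    // it may be merged into the task of the operator right before it.
    static final class Op {
        final String name;
        final boolean chainableToPrevious;

        Op(String name, boolean chainableToPrevious) {
            this.name = name;
            this.chainableToPrevious = chainableToPrevious;
        }
    }

    // Walk the pipeline in order and merge every chainable operator
    // into the task that holds its predecessor.
    static List<String> toTasks(List<Op> ops) {
        List<String> tasks = new ArrayList<>();
        for (Op op : ops) {
            if (op.chainableToPrevious && !tasks.isEmpty()) {
                int last = tasks.size() - 1;
                tasks.set(last, tasks.get(last) + " + " + op.name);
            } else {
                tasks.add(op.name);
            }
        }
        return tasks;
    }

    // The WordCount pipeline as drawn in the question (assumption: only
    // map() is chainable to its predecessor, as the picture suggests).
    static List<Op> wordCount() {
        List<Op> ops = new ArrayList<>();
        ops.add(new Op("Source", false));
        ops.add(new Op("map()", true));
        ops.add(new Op("keyBy()/window()/apply()", false));
        ops.add(new Op("Sink", false));
        return ops;
    }

    public static void main(String[] args) {
        // Prints one line per resulting task.
        for (String task : toTasks(wordCount())) {
            System.out.println(task);
        }
    }
}
```

Under these assumptions the four operators collapse into the three tasks listed above, which is the number my question about slots is based on.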
If I'm right, one slot is one thread in a Flink TaskManager, and this is where I get confused. In this example we have three tasks, so does that mean we must have three slots (each task gets its own thread), or does it mean we must create a TaskManager with three slots for this example? What if the TaskManager has only one or two slots? If we have fewer than three slots, will an exception be thrown?