0
votes

In Flink, operators like 'flatMap', 'map' and such are called Task, if I set parallelism of flatMap to 30, then this Task has 30 SubTasks.

Now, if I have only 1 slot, will it result in multiple threads in one slot? Or still only one thread per slot?

Will Flink simply create 30 threads in that slot, or it uses something like thread pool?

EDIT:

Above is not an appropriate example.

Let's say in the Job I have operator flatMap and map, they both have parallelism 1, and I have only one slot, will this slot has 2 threads created? (chaining disabled)

1

1 Answers

1
votes

Each task slot can contain multiple subtasks (operators). However, a slot can't contain multiple instances of the same operator - i.e. it can share different flatMap, map, etc. subtasks, but not a couple of same maps. Chained operators are executed within the same thread, meanwhile not chained - in different threads.

enter image description here

So, in your case if you specify the paralelism of flatMap to 30, flink will require to have 30 slots to assign these subtasks to. It won't be able to assign all 30 subtasks to a single slot.

However, what you can do is to specify the number of slots per task manager via taskmanager.numberOfTaskSlots. So you can acquire a container with only 1 vcore but assign 30 slots to it, thus getting 30 separate threads and assign subtasks to a single vcore.

Please, refer to the Task slots and resources section of Flink official documentation.

Update: as @DavidAnderson noted,

a task slot may have many threads. Each subtask is executed in a separate JVM thread, and a task slot is basically a thread pool.