What happens if total parallel instances of operators are higher than parallelism of the flink system?
Here is the scenario:
- Let's say I have a standalone flink application with 1 JobManager and 1 TaskManager(has 5 CPU)
- I have setup the
taskmanager.numberOfTaskSlots=5
andparallelism.default=5
- There are 2 data sources(assume that two different kafka topics which each of them five partitions)
- Chaining strategy disabled for all operators
- Dataflow of my application (I have only 1 job which includes both two kafka sources):
kafkaSource1.map(Mapper1).sink(sink1);
kafkaSource2.map(Mapper2).sink(sink1);
After deploying this dataflow with 5 parallelism, will TaskManager suffer from overload?
As far as my understanding, Tasks will be spreaded to the TaskManager's slots like this one:
- If this is correct diagram, in this diagram each slots have 2 different operators's instances. How it will work? It will work parallel or sequencial manner(first kafka1->map1->sink1, then kafka2->map2->sink1)
- If it is not correct, how it will work, how task will be spreaded to the slots?