I have a workflow constructed in Flink that consists of a custom source, a series of maps/flatmaps and a sink.
The run() method of my custom source iterates through the files stored in a folder and collects, through the collect() method of the context, the name and the contents of each file (I have a custom object that stores this info in two fields).
I then have a series of maps/flatmaps transforming such objects which are then printed into files using a custom sink. The execution graph as this is produced in the Flink's Web UI is the following:
I have a cluster or 2 workers setup to have 6 slots each (they both have 6 cores, too). I set the parallelism to 12. From the execution graph I see that the parallelism of the source is 1, while the rest of the workflow has parallelism 12.
When I run the workflow (I have around 15K files in the dedicated folder) I monitor, using htop, the resources of my workers. All the cores reach up to 100% utilisation for most of the time but every roughly 30 minutes or so, 8-10 of the cores become idle for about 2-3 minutes.
My questions are the following:
I understand that the source runs having parallelism 1 which I believe is normal when reading from a local storage (my files are located into the same directory in each worker as I don't know which worker will be selected to execute the source). Is it normal indeed? Could you please explain why this is the case?
The rest of my workflow is executed having parallelism 12 which looks to be correct as by checking the task managers' logs I get prints from all the slots (e.g.,
.... [Flat Map -> Map -> Map -> Sink: Unnamed (**3/12**)] INFO ....
,.... [Flat Map -> Map -> Map -> Sink: Unnamed (**5/12**)] INFO ....
, etc.)). What I don't understand though is if one slot is executing the source role and I have 12 slots in my cluster, how is the rest of the workflow executed by 12 slots? Is one slot acting for both the source and one instance of the rest of the workflow? If yes, how are the resources for this specific slot allocated? Would it be possible for someone to explain the steps undergoing in this workflow? For example (this might be wrong):
- Slot 1 reads files and forwards them to available slots (2 to 12)
- Slot 1 forwards one file to itself and stops reading until it finishes its job
- When done, slot 1 reads more files and forwards them to slots that became available
I believe what I describe above is wrong but I give it as an example to better explain my question
- Why I have this idle state for the majority of the cores every 30 minutes (more or less) that lasts for about 3 minutes?