I am trying to find out the best way to configure the memory on the nodes of my cluster. However, I believe that to do that there are some things I need to understand better, such as how Spark handles memory across tasks.
For example, let's say I have 3 executors, and each executor can run up to 8 tasks in parallel (i.e. 8 cores). If I have an RDD with 24 partitions, this means that theoretically all partitions can be processed in parallel. However, if we zoom into one executor, this assumes that each task can hold its partition in memory while operating on it. If not, then 8 tasks in parallel will not happen and some scheduling will be required.
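To make the scenario concrete, this is roughly the setup I have in mind (the executor count, core count, memory value and input path are just illustrative, not my actual job):

```scala
import org.apache.spark.sql.SparkSession

// Illustrative only: 3 executors x 8 cores = 24 task slots, and an RDD with
// 24 partitions, so that in theory every partition can be processed at once.
val spark = SparkSession.builder()
  .appName("parallelism-example")
  .config("spark.executor.instances", "3")
  .config("spark.executor.cores", "8")
  .config("spark.executor.memory", "8g") // the value I am trying to size correctly
  .getOrCreate()

val rdd = spark.sparkContext
  .textFile("hdfs:///some/input/path")   // hypothetical input path
  .repartition(24)
```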
Hence I concluded that if you seek true parallelism, having some idea of the size of your partitions is helpful, as it tells you how to size your executors accordingly.
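Continuing the sketch above, this is the kind of rough partition-size check I mean (just a sketch; SizeEstimator measures deserialized JVM object size from a small sample, not the on-disk size):

```scala
import org.apache.spark.util.SizeEstimator

// Record counts per partition, to get a feel for partition size and skew
val recordsPerPartition = rdd
  .mapPartitionsWithIndex { (idx, it) => Iterator((idx, it.size)) }
  .collect()

recordsPerPartition.foreach { case (idx, n) =>
  println(s"partition $idx: $n records")
}

// Very rough per-record in-memory size, estimated from a small sample
val sample = rdd.take(100)
val approxBytesPerRecord = SizeEstimator.estimate(sample) / math.max(sample.length, 1)
println(s"~$approxBytesPerRecord bytes per record in memory")
```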
Q0 - I simply want to understand a little better: what happens when not all the partitions can fit in memory within one executor? Are some spilled to disk while others are kept in memory for operations? Does Spark reserve memory for each task, and if it detects that there is not enough, does it delay scheduling the tasks? Or does it simply run into an out-of-memory error?

Q1 - Does true parallelism within an executor also depend on the amount of memory available on that executor? In other words, I may have 8 cores, but if I do not have enough memory to load 8 partitions of my data at once, then I won't be fully parallel.
As a final note, I have seen the following statement several times but find it a little confusing:
“Increasing the number of partitions can also help reduce out-of-memory errors, since it means that Spark will operate on a smaller subset of the data for each executor.”
How does that work exactly? I mean, Spark may work on a smaller subset, but if the total set of partitions can't fit in memory anyway, what happens?
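For example, is the idea just that something like the following gives each task a smaller slice to hold in memory at any given moment, while the remaining partitions wait until a task slot frees up (the number 240 is purely illustrative)?

```scala
// Same data, more (and therefore smaller) partitions: each running task now
// holds a smaller slice in memory; partitions not yet scheduled stay on
// disk/HDFS until a slot becomes free.
val morePartitions = rdd.repartition(240)
```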