I am interested in the following nitty-gritty details of Spark parallelism and partitioning:
- How many partitions can an executor hold in Spark?
- How are the partitions distributed (mechanism) among the executors?
- How do I set the size of a partition? I would like to know the relevant config parameter.
- Does an executor store all of its partitions in memory? If not, when it spills to disk, does it spill an entire partition or only part of a partition?
- When there are 2 cores per executor but there are 5 partitions in that executor, then