I am running a Spark application with 5 executors, each with 5 cores. However, I have noticed that a single executor does most of the work (i.e., most of the tasks run there). The jobs I am running are highly parallel (20 partitions or more). How do you explain this behavior?
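To make the setup concrete, here is a minimal sketch of the kind of job I am running (the real application differs; names, numbers, and the job itself are placeholders):

```scala
import org.apache.spark.sql.SparkSession

object SkewedExecutorExample {
  def main(args: Array[String]): Unit = {
    // 5 executors x 5 cores, so up to 25 tasks should be able to run in parallel.
    val spark = SparkSession.builder()
      .appName("skewed-executor-example")
      .config("spark.executor.instances", "5")
      .config("spark.executor.cores", "5")
      .getOrCreate()

    // A highly parallel job: 20 partitions, yet most tasks end up on one executor.
    val counts = spark.sparkContext
      .parallelize(1 to 1000000, 20)   // 20 partitions
      .map(x => (x % 100, 1L))
      .reduceByKey(_ + _)

    println(counts.count())
    spark.stop()
  }
}
```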
Even if I decrease the number of cores per executor, that just results in fewer tasks running concurrently on that same single executor. Should I limit the memory per executor so that more executors are used (in case the whole dataset currently fits on a single executor)?
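For reference, the change I am considering would look roughly like this; I am not sure it is the right approach, and the "2g" figure is made up (today I just use the default executor memory):

```scala
import org.apache.spark.sql.SparkSession

object CappedMemoryExample {
  def main(args: Array[String]): Unit = {
    // Same 5 x 5 setup as above, but with per-executor memory deliberately capped
    // so that the whole dataset can no longer fit on one executor.
    val spark = SparkSession.builder()
      .appName("capped-memory-example")
      .config("spark.executor.instances", "5")
      .config("spark.executor.cores", "5")
      .config("spark.executor.memory", "2g")   // placeholder value
      .getOrCreate()

    // ... same job as before ...
    spark.stop()
  }
}
```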