Suppose I have an RDD with 1,000 elements and 10 executors. Right now I parallelize the RDD into 10 partitions, so each executor processes 100 elements (assume 1 task per executor).
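Here is a minimal sketch of the current setup (the class name and `processElement` are just placeholders for the real per-element work, which I've simulated with a sleep):

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class PartitionExample {

    // Placeholder for the real per-element work; some elements
    // take much longer than others (simulated here with a sleep).
    static int processElement(int x) throws InterruptedException {
        Thread.sleep(x % 7 == 0 ? 500 : 10);
        return x * 2;
    }

    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("partition-example");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            List<Integer> data = new ArrayList<>();
            for (int i = 0; i < 1000; i++) {
                data.add(i);
            }

            // 10 partitions -> 10 tasks: with 10 executors and 1 task each,
            // every executor gets a fixed block of 100 elements up front.
            JavaRDD<Integer> rdd = sc.parallelize(data, 10);

            List<Integer> results = rdd.map(PartitionExample::processElement).collect();
            System.out.println("Processed " + results.size() + " elements");
        }
    }
}
```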
My difficulty is that some of these tasks may take much longer than others: say 8 executors finish quickly, while the remaining 2 are stuck working for much longer. The driver then has to wait for those 2 to finish before moving on, while the other 8 sit idle.
What would be a way to make the idling executors 'take' some work from the busy ones? Unfortunately I can't anticipate ahead of time which ones will end up 'busier' than others, so I can't balance the RDD properly up front.
Can I somehow make executors communicate with each other programmatically? I was thinking of sharing a DataFrame with the executors, but from what I can tell, a DataFrame cannot be manipulated inside an executor.
I am using Spark 2.2.1 and Java.