
Suppose I have an RDD with 1,000 elements and 10 executors. Right now I parallelize the RDD into 10 partitions, so each executor processes 100 elements (assume 1 task per executor).
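For reference, here is roughly what I do in Java (a minimal sketch; process is a placeholder for the actual per-element work):

import java.util.ArrayList;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SkewedJob {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("skewed-job");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // 1,000 elements split explicitly into 10 partitions:
        // with 10 executors (1 task each), each gets ~100 elements
        List<Integer> elements = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            elements.add(i);
        }
        JavaRDD<Integer> rdd = sc.parallelize(elements, 10);

        // some partitions take much longer to process than others
        rdd.foreach(x -> process(x));

        sc.stop();
    }

    // placeholder for the real per-element work
    private static void process(int x) {
    }
}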

My difficulty is that some of these partitioned tasks may take much longer than others: say 8 executors finish quickly while the remaining 2 are stuck on something for longer. The driver then has to wait for those 2 to finish before moving on, while the other 8 sit idle.

What would be a way to make the idling executors 'take' some work from the busy ones? Unfortunately I can't anticipate ahead of time which ones will end up 'busier' than the others, so I can't balance the RDD properly in advance.

Can I somehow make the executors communicate with each other programmatically? I was thinking of sharing a DataFrame among the executors, but from what I have seen, a DataFrame cannot be manipulated inside an executor.

I am using Spark 2.2.1 and Java.


1 Answer


Try using Spark dynamic resource allocation, which scales the number of executors registered with the application up and down based on the workload.

You can enable it with the following properties:

spark.dynamicAllocation.enabled = true
spark.shuffle.service.enabled = true
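
If you set configuration programmatically rather than in spark-defaults.conf, a minimal Java sketch might look like this (the app name is illustrative):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class DynamicAllocationApp {
    public static void main(String[] args) {
        // dynamic allocation also requires the external shuffle
        // service to be running on each worker node
        SparkConf conf = new SparkConf()
                .setAppName("dynamic-allocation-app")
                .set("spark.dynamicAllocation.enabled", "true")
                .set("spark.shuffle.service.enabled", "true");

        JavaSparkContext sc = new JavaSparkContext(conf);
        // ... your job ...
        sc.stop();
    }
}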

You can also consider configuring the following properties:

spark.dynamicAllocation.executorIdleTimeout
spark.dynamicAllocation.maxExecutors
spark.dynamicAllocation.minExecutors
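
Alternatively, all of these can be passed at submit time. For example (the class name, jar, and values below are illustrative, not recommendations):

spark-submit \
  --class com.example.MyApp \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=2 \
  --conf spark.dynamicAllocation.maxExecutors=10 \
  --conf spark.dynamicAllocation.executorIdleTimeout=60s \
  my-app.jar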

Spark provides a mechanism to dynamically adjust the resources your application occupies based on the workload. This means that your application may give resources back to the cluster if they are no longer used and request them again later when there is demand. This feature is particularly useful if multiple applications share resources in your Spark cluster.