For example, I currently have a Dataproc cluster consisting of a master and 4 workers, each machine with 8 vCPUs and 30 GB of memory.
Whenever I submit a job, the cluster commits a maximum of 11 GB of memory in total, engages only 2 of the 4 worker nodes, and uses only 2 vCPUs on each of those nodes. As a result, a job that should take a few minutes takes nearly an hour to execute.
I have tried editing the spark-defaults.conf file on the master node, and I have tried running my spark-submit command with the arguments --executor-cores 4 --executor-memory 20g --num-executors 4 (roughly as in the sketch below), but neither has had any effect.
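
In case it helps, here is a minimal sketch of what I have been trying. The job script name (my_job.py) and its arguments are placeholders, and the spark-defaults.conf path is where I believe Dataproc keeps the Spark configuration; the property names are the standard Spark equivalents of the command-line flags.

    # spark-submit command run on the master node (my_job.py is a placeholder):
    spark-submit \
      --executor-cores 4 \
      --executor-memory 20g \
      --num-executors 4 \
      my_job.py

    # Equivalent properties I also tried adding to the config file,
    # which I assume lives at /etc/spark/conf/spark-defaults.conf on Dataproc:
    spark.executor.cores      4
    spark.executor.memory     20g
    spark.executor.instances  4
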
These clusters will only be spun up to perform a single task and will then be torn down, so the resources do not need to be held for any other jobs.