I am running several Spark jobs in parallel on a YARN cluster. YARN starts many of these jobs concurrently but allocates only a single container to each (for the driver) and no executors. These jobs then sit idle waiting for an executor to join, when that processing power could be better utilised by allocating executors to fewer jobs.
I would like to configure YARN to allocate a minimum of two containers (one driver + one executor) to a job, and, if two containers are not available, to keep the job in the queue instead. How can I configure YARN in this way?
(I am running on an AWS EMR cluster with nearly all of the default settings.)
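In case it helps, the jobs are submitted roughly like this (the script name and resource sizes below are illustrative, not my actual values):

```shell
# Illustrative submission; actual script name and sizes differ.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 2 \
  --executor-memory 4g \
  --executor-cores 2 \
  my_job.py
```

When several of these are submitted at once, YARN launches each application's ApplicationMaster container (which holds the driver in cluster deploy mode), but the executor container requests remain pending.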