By default, Dataproc is supposed to fit two executors per worker (i.e., per YARN NodeManager), with each executor getting half the cores and half the memory. And out of the box it does work that way.
However, if we override a setting, say spark.yarn.executor.memoryOverhead=4096, then only one executor is created per worker, and half of the cluster's cores and memory go unused. No matter how we play around with spark.executor.memory or spark.executor.cores, it still doesn't spin up enough executors to use all of the cluster's resources.
How can we make Dataproc still create two executors per worker? As I understand it, the YARN overhead is deducted out of the executor memory, so two executors should still fit, shouldn't they?
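
To make the arithmetic concrete, here is a small sketch with made-up numbers (the node memory, executor memory, and minimum allocation below are illustrative, not the real Dataproc defaults). The two models correspond to the overhead being carved out of, versus added on top of, spark.executor.memory:

```python
# Sketch of the packing arithmetic (illustrative numbers only -- the node memory,
# executor memory, and minimum allocation below are made up, not Dataproc's defaults).

node_mem_mb = 12 * 1024      # yarn.nodemanager.resource.memory-mb on one worker (assumed)
executor_mem_mb = 5 * 1024   # spark.executor.memory as chosen by Dataproc (assumed)
overhead_mb = 4096           # our spark.yarn.executor.memoryOverhead override

def executors_per_node(container_mb: int, node_mb: int, min_alloc_mb: int = 1024) -> int:
    """YARN rounds each container request up to a multiple of
    yarn.scheduler.minimum-allocation-mb and then packs whole containers."""
    rounded = -(-container_mb // min_alloc_mb) * min_alloc_mb  # ceil to a multiple
    return node_mb // rounded

# Model A: the overhead is carved out of spark.executor.memory, so the container
# request stays at executor_mem_mb and two executors still fit per node.
print(executors_per_node(executor_mem_mb, node_mem_mb))                # -> 2

# Model B: the overhead is added on top of spark.executor.memory, so the container
# request grows past half the node's memory and only one executor fits per node.
print(executors_per_node(executor_mem_mb + overhead_mb, node_mem_mb))  # -> 1
```

Which of these two models is the one YARN actually applies here, and what do we have to change so that two executors per worker fit again?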