I am on CDH 5.7.0 and I am seeing a strange issue with Spark 2 running on a YARN cluster. Here is my job submit command:
spark2-submit --master yarn --deploy-mode cluster \
  --conf "spark.executor.instances=8" \
  --conf "spark.executor.cores=4" \
  --conf "spark.executor.memory=8g" \
  --conf "spark.driver.cores=4" \
  --conf "spark.driver.memory=8g" \
  --class com.learning.Trigger learning-1.0.jar
Even though I have capped the resources my job can use, the actual utilization exceeds the allocated amount.
The job starts with a modest memory footprint of around 8 GB and then grows until it eats up the whole cluster.
I do not have dynamic allocation (spark.dynamicAllocation.enabled) set to true.
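If it matters, this is how the effective value could be checked from inside the driver (a quick sketch, assuming the SparkSession is named spark):

    // Prints the effective setting; None means the property was never set,
    // in which case Spark falls back to the default of false.
    println(spark.conf.getOption("spark.dynamicAllocation.enabled"))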
I am just triggering an INSERT OVERWRITE query through the SparkSession SQL API.
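For reference, the job body is essentially just this (a simplified sketch; the real table names are placeholders here):

    import org.apache.spark.sql.SparkSession

    object Trigger {
      def main(args: Array[String]): Unit = {
        // Hive support is needed for INSERT OVERWRITE on Hive tables
        val spark = SparkSession.builder()
          .appName("Trigger")
          .enableHiveSupport()
          .getOrCreate()

        // The only work the job performs: an INSERT OVERWRITE issued
        // through the SQL API. Table names are placeholders.
        spark.sql(
          "INSERT OVERWRITE TABLE target_db.target_table " +
            "SELECT * FROM source_db.source_table")

        spark.stop()
      }
    }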
Any pointers would be very helpful.