
I am submitting a Spark application on YARN with the following configs:

conf.set("spark.executor.cores", "3")
conf.set("spark.executor.memory", "14g")
conf.set("spark.executor.instances", "4")
conf.set("spark.driver.cores", "5")
conf.set("spark.driver.memory", "1g")

But on the YARN ResourceManager UI it shows vCores used = 5. I am expecting vCores used to be 17 ((4x3)+5=17), i.e. 12 for the executors and 5 for the driver, but it always shows the number of containers: 4 executors + 1 driver = 5.

Please help me understand this! Thanks in advance.


1 Answer


In the Spark configuration docs you'll see the following:

Spark properties mainly can be divided into two kinds: one is related to deploy, like “spark.driver.memory”, “spark.executor.instances”, this kind of properties may not be affected when setting programmatically through SparkConf in runtime, or the behavior is depending on which cluster manager and deploy mode you choose, so it would be suggested to set through configuration file or spark-submit command line options; another is mainly related to Spark runtime control, like “spark.task.maxFailures”, this kind of properties can be set in either way.
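
A minimal sketch of that distinction (the app name and values here are placeholders, not from the post):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("vcores-example")
  // Runtime-control property: takes effect when set programmatically.
  .set("spark.task.maxFailures", "8")
  // Deploy-related property: depending on the cluster manager and
  // deploy mode, this may be ignored here, because containers can
  // already be allocated before this code runs.
  .set("spark.executor.instances", "4")

val sc = new SparkContext(conf)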

Most of these settings are better set on the spark-submit command line than in code. That is generally good practice anyway: you can launch the job with different parameters without having to recompile it.

You'd want something like:

spark-submit --num-executors 4 --executor-cores 3 --executor-memory 14g --driver-memory 1g --driver-cores 5 --class <main_class> <your_jar>
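
If you'd rather not repeat those flags on every launch, the docs' other suggestion, a configuration file, works too. The same values (taken from the question) can go in conf/spark-defaults.conf:

spark.executor.instances  4
spark.executor.cores      3
spark.executor.memory     14g
spark.driver.cores        5
spark.driver.memory       1g

Each spark-submit flag maps to one of these properties (--num-executors is spark.executor.instances, --executor-cores is spark.executor.cores, and so on), and flags passed on the command line take precedence over the defaults file.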