1 vote

Is there a mapping/translation from the number of hardware systems, CPU cores, and their associated memory to the spark-submit tunables executor-memory, executor-cores, and num-executors? The application certainly has some bearing on these tunables; I am, however, looking for a basic rule of thumb. Apache Spark is running on YARN with HDFS in cluster mode. Not all the hardware systems in the Spark/Hadoop YARN cluster have the same number of CPU cores or the same amount of RAM.

I think the general idea is to oversubscribe to resources on the cluster and let the Spark driver determine the best configuration. – Saif Charaniya
How about a mapping of the Spark tunables with respect to CPU cores and RAM? – user5191140
I don't think there's much of a mapping available, but there are preferred hardware provisions: spark.apache.org/docs/latest/hardware-provisioning.html – Saif Charaniya

1 Answer

0
votes

There is no rule of thumb, but you can derive a suitable configuration after considering:

  1. Off-heap memory
  2. The number of applications and other Hadoop daemons running
  3. Resource Manager needs
  4. HDFS I/O

etc.

Please check this url
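The considerations above can be turned into a rough sizing calculation. The sketch below uses a commonly cited heuristic that is not part of the answer itself and is only an assumption: reserve one core and 1 GB per node for the OS and Hadoop daemons, cap executors at about 5 cores each, deduct roughly 7% of executor memory for YARN's off-heap overhead, and leave one executor slot for the YARN ApplicationMaster. The function name and parameters are hypothetical.

```python
# A rough sizing heuristic (an assumption, not an official Spark formula):
# - reserve 1 core and 1 GB of RAM per node for the OS and Hadoop daemons
# - cap executors at ~5 cores each (a widely quoted HDFS-throughput limit)
# - deduct ~7% of executor memory for YARN's off-heap overhead
# - leave one executor slot for the YARN ApplicationMaster

def suggest_executor_config(nodes, cores_per_node, mem_gb_per_node,
                            cores_per_executor=5):
    usable_cores = cores_per_node - 1        # 1 core per node for daemons
    usable_mem_gb = mem_gb_per_node - 1      # 1 GB per node for daemons
    executors_per_node = usable_cores // cores_per_executor
    mem_per_executor = usable_mem_gb / executors_per_node
    executor_memory_gb = int(mem_per_executor * 0.93)  # ~7% off-heap overhead
    num_executors = nodes * executors_per_node - 1     # 1 slot for the AM
    return {
        "num-executors": num_executors,
        "executor-cores": cores_per_executor,
        "executor-memory": f"{executor_memory_gb}g",
    }

# Example: a 5-node cluster with 16 cores and 64 GB RAM per node
print(suggest_executor_config(5, 16, 64))
# → {'num-executors': 14, 'executor-cores': 5, 'executor-memory': '19g'}
```

Since the question notes that the nodes are heterogeneous, one conservative option is to run this calculation with the specs of the smallest node, since YARN containers of a uniform size must fit on every machine.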