How to set executor number by memory in YARN mode?

Question

I did some testing on an r3.8xlarge cluster; each instance has 32 cores and 244G of memory.

If I set spark.executor.cores=16 and spark.executor.memory=94G, there are 2 executors per instance, but when I set spark.executor.memory larger than 94G, there is only one executor per instance.

If I set spark.executor.cores=8 and spark.executor.memory=35G, there are 4 executors per instance, but when I set spark.executor.memory larger than 35G, there are no more than 3 executors per instance.

So my question is: how is the number of executors derived from the memory setting? What's the formula? I thought Spark simply allocated 70% of the physical memory to the executors, but it seems I'm wrong...

Accepted Answer · 2016-02-11T21:47:21

In YARN mode you need to set the number of executors with --num-executors and the memory per executor with --executor-memory. Here's an example:

spark-submit --master yarn-cluster --executor-memory 6G --num-executors 31 --executor-cores 32 example.jar Example

Now each executor requests a container from YARN with 6G plus the memory overhead, and with the number of cores given by --executor-cores (32 in this command).
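To make the "6G + memory overhead" arithmetic concrete, here is a rough sketch of how many such containers fit on one node by memory alone. It assumes the overhead is max(384 MB, 10% of executor memory), which is the default for spark.yarn.executor.memoryOverhead in Spark 1.4 and later (older releases used a smaller factor), and that YARN rounds each request up to a multiple of yarn.scheduler.minimum-allocation-mb. The function names and the 64 GB node size are made up for illustration; the real limit is whatever yarn.nodemanager.resource.memory-mb is set to on your cluster, which is usually well below the physical memory.

```python
import math


def container_memory_mb(executor_memory_mb,
                        overhead_factor=0.10,    # assumed Spark default (1.4+)
                        min_overhead_mb=384,     # assumed Spark default floor
                        yarn_increment_mb=1024):  # assumed yarn.scheduler.minimum-allocation-mb
    """Approximate size of the YARN container requested for one executor."""
    overhead_mb = max(min_overhead_mb, int(executor_memory_mb * overhead_factor))
    requested_mb = executor_memory_mb + overhead_mb
    # YARN normalizes requests up to a multiple of its allocation increment.
    return math.ceil(requested_mb / yarn_increment_mb) * yarn_increment_mb


def executors_per_node_by_memory(node_memory_mb, executor_memory_mb, **kwargs):
    """How many executor containers fit in the memory YARN manages on one node."""
    return node_memory_mb // container_memory_mb(executor_memory_mb, **kwargs)


if __name__ == "__main__":
    executor_mb = 6 * 1024   # --executor-memory 6G from the command above
    node_mb = 64 * 1024      # hypothetical yarn.nodemanager.resource.memory-mb
    print(container_memory_mb(executor_mb))
    print(executors_per_node_by_memory(node_mb, executor_mb))
```

This only covers the memory side. Whether cores also limit the count depends on the resource calculator the YARN scheduler is configured with, and the application master container takes some memory on one of the nodes as well, so treat the result as an upper bound rather than an exact prediction.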
