2 votes

I am running Spark over YARN on a 4-node cluster. Each node has 128 GB of memory and a 24-core CPU. I launch Spark with this command:

spark-shell --master yarn --num-executors 19 --executor-memory 18g --executor-cores 4 --driver-memory 4g

But Spark only launches 16 executors at most. I have the maximum vcore allocation in YARN set to 80 (out of the 94 cores I have), so I was under the impression that this would launch 19 executors, but it only goes up to 16. Also, I don't think even these executors are fully using the vcores allocated to them.
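One quick way to confirm the executor count from inside spark-shell is sketched below; it uses sc.getExecutorMemoryStatus, which also lists an entry for the driver, hence the -1:

  // Count the executors currently registered with the driver
  // (getExecutorMemoryStatus includes the driver itself, so subtract 1)
  val executorCount = sc.getExecutorMemoryStatus.size - 1
  println(s"executors: $executorCount")   // shows 16 here instead of the requested 19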

These are my questions:

  1. Why isn't Spark creating 19 executors? Is there a computation behind the scenes that limits it?
  2. What is the optimal configuration for running spark-shell on my cluster if I want the best possible Spark performance?
  3. driver-cores is set to 1 by default. Will increasing it improve performance?

Here is my YARN config:

  • yarn.nodemanager.resource.memory-mb: 106496
  • yarn.scheduler.minimum-allocation-mb: 3584
  • yarn.scheduler.maximum-allocation-mb: 106496
  • yarn.scheduler.minimum-allocation-vcores: 1
  • yarn.scheduler.maximum-allocation-vcores: 20
  • yarn.nodemanager.resource.cpu-vcores: 20
Can you tell me what your executor memory overhead is set to? You will need to include the executor memory overhead in the computation. - Adi Kish
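A rough sketch of that computation is below, assuming Spark's default spark.yarn.executor.memoryOverhead of max(384 MB, 7% of the executor memory) and YARN rounding each container request up to a multiple of yarn.scheduler.minimum-allocation-mb:

  // Back-of-envelope container arithmetic under the assumptions above
  val executorMemMb = 18 * 1024                                   // --executor-memory 18g
  val overheadMb    = math.max(384, (0.07 * executorMemMb).toInt) // ≈ 1290 MB
  val minAllocMb    = 3584                                        // yarn.scheduler.minimum-allocation-mb
  val requestMb     = executorMemMb + overheadMb                  // ≈ 19722 MB asked of YARN
  val containerMb   = math.ceil(requestMb.toDouble / minAllocMb).toInt * minAllocMb // 21504 MB granted
  val perNode       = 106496 / containerMb                        // 4 containers fit per node
  val totalExecutors = perNode * 4                                // 16 across the 4 nodes

Under those assumptions each executor container ends up at 21504 MB, only 4 of them fit in the 106496 MB a node offers, and 4 nodes x 4 containers gives exactly the 16 executors observed.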

2 Answers

2 votes

OK, so going by your configuration we have 24 cores and 128 GB of RAM per node, with 4 nodes in the cluster. (I am also a newbie at Spark, but below is what I speculate in this scenario.)

We allocate 1 core and 1 GB of memory per node for overhead, and assume you are running your cluster in YARN client mode.

That leaves us with 127 GB of RAM and 23 cores on each of the 4 nodes.

As mentioned in the Cloudera blog, YARN runs at optimal performance when at most 5 cores are allocated per executor.

So, 23 × 4 = 92 cores. If we allocate 5 cores per executor, we get 18 executors with 5 cores each (and a couple of cores left over). So let's assume our application has 18 executors with 5 cores per executor. Spark distributes these 18 executors across the 4 nodes, for example:

  • 1st node: 4 executors
  • 2nd node: 4 executors
  • 3rd node: 5 executors
  • 4th node: 5 executors

Now, since yarn.nodemanager.resource.memory-mb: 106496 means each node can allocate at most about 104 GB of memory to containers (I would suggest increasing this parameter), we get: for nodes with 4 executors, 104/4 = 26 GB per executor; for nodes with 5 executors, 104/5 ≈ 21 GB per executor. Leaving out about 7% of that for memory overhead, we get roughly 24 GB and 20 GB.
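That division, as a quick sketch (rough numbers; the 7% overhead fraction is only an approximation of Spark's default):

  // Approximate per-executor memory, reserving ~7% for YARN memory overhead
  val nodeMemGb = 104.0                       // from yarn.nodemanager.resource.memory-mb
  val perExec4  = nodeMemGb / 4               // 26.0 GB when a node hosts 4 executors
  val perExec5  = nodeMemGb / 5               // 20.8 GB when a node hosts 5 executors
  val heap4     = perExec4 * 0.93             // ≈ 24 GB usable heap
  val heap5     = perExec5 * 0.93             // ≈ 19-20 GB usable heap
  println(f"$heap4%.1f GB / $heap5%.1f GB")   // prints 24.2 GB / 19.3 GB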

So I would suggest using the following configuration:

  • --num-executors 18
  • --executor-memory 20G
  • --executor-cores 5
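Putting that together with your original launch command, it would look something like this:

  spark-shell --master yarn --num-executors 18 --executor-memory 20g --executor-cores 5 --driver-memory 4g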

Also, this assumes you are running your cluster in client mode; if you run in YARN cluster mode, one node will be allocated to the driver program and the calculations will need to be done differently.

0 votes

I still cannot comment, so I am posting this as an answer.

See this question. Could you please decrease the executor memory and try running this again?
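For example, something like the following, where 16g is just an illustrative value to test whether a smaller container request lets YARN schedule all 19 executors:

  spark-shell --master yarn --num-executors 19 --executor-memory 16g --executor-cores 4 --driver-memory 4g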