
The YARN ResourceManager is not showing the total cores for our Spark application. For example, if we submit a Spark job with 300 executors and executor-cores set to 3, Spark should ideally have 900 cores, but the YARN ResourceManager shows only 300.
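For reference, the job is submitted roughly like this (the class and jar names are placeholders):

spark-submit --master yarn-cluster \
    --num-executors 300 \
    --executor-cores 3 \
    --class com.example.MyApp \
    myapp.jar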

So is this just a display error, or is YARN not seeing the remaining 600 cores?

Environment: HDP 2.2, Scheduler: capacity-scheduler, Spark: 1.4.1


1 Answer


Set

yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator

in capacity-scheduler.xml
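In XML form, the property looks like this (standard Hadoop configuration format):

<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>

Then refresh the scheduler configuration (e.g. yarn rmadmin -refreshQueues) or restart the ResourceManager for the change to take effect.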

YARN is running more containers than it has allocated cores because, by default, DefaultResourceCalculator is used, and it considers only memory. Each container is therefore counted as a single vcore regardless of how many cores it requested, which is why the ResourceManager shows 300 cores for your 300 executors.

public int computeAvailableContainers(Resource available, Resource required) {
    // Only consider memory
    return available.getMemory() / required.getMemory();
}

Use DominantResourceCalculator instead; it takes both CPU and memory into account.
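For comparison, DominantResourceCalculator's version of the same method looks roughly like this (paraphrased from the Hadoop 2.x source; the exact code may differ between versions):

public int computeAvailableContainers(Resource available, Resource required) {
    // Consider both memory and vcores; the scarcer resource bounds the result
    return Math.min(
        available.getMemory() / required.getMemory(),
        available.getVirtualCores() / required.getVirtualCores());
}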

You can read more about DominantResourceCalculator here.