2
votes

Has anyone been able to add more than the default queue to yarn on Spark 2.x in Dataproc?

Attempts that fail at cluster creation time:

capacity-scheduler:yarn.scheduler.capacity.root.queues=alpha,beta,default
yarn:yarn.scheduler.capacity.root.queues=alpha,beta,default

Additionally, setting yarn.scheduler.fair.allow-undeclared-pools=true under either of the above configuration prefixes to enable dynamic queues also fails.

In every case the daemon fails to start, leaving the ResourceManager dead on launch.


1 Answer

6
votes

Each queue needs to have a capacity specified. The properties needed for your example are as follows:

capacity-scheduler:yarn.scheduler.capacity.root.queues=alpha,beta,default
capacity-scheduler:yarn.scheduler.capacity.root.alpha.capacity=20
capacity-scheduler:yarn.scheduler.capacity.root.beta.capacity=20
capacity-scheduler:yarn.scheduler.capacity.root.default.capacity=60

All of the specified capacities must sum to 100% of the root queue's resources. The full set of configuration options for the capacity scheduler can be found in the Hadoop documentation.
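For reference, a cluster-creation command with these properties might look like the sketch below (the cluster name and region are placeholders). Because the queue list itself contains commas, gcloud's default comma delimiter for --properties would split it incorrectly; the `^#^` prefix switches the delimiter to `#` (see `gcloud topic escaping`):

```shell
# Sketch: create a Dataproc cluster with three capacity-scheduler queues.
# "my-cluster" and "us-central1" are placeholder values.
gcloud dataproc clusters create my-cluster \
  --region=us-central1 \
  --properties='^#^capacity-scheduler:yarn.scheduler.capacity.root.queues=alpha,beta,default#capacity-scheduler:yarn.scheduler.capacity.root.alpha.capacity=20#capacity-scheduler:yarn.scheduler.capacity.root.beta.capacity=20#capacity-scheduler:yarn.scheduler.capacity.root.default.capacity=60'
```

Jobs can then be submitted to a specific queue, e.g. with the Spark property spark.yarn.queue=alpha.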