2
votes

In YARN, the application master requests resources from the resource manager so that the containers for that application can be launched.

  1. Does the application master wait for all the resources to be allocated before it launches the first container, or does it request each container separately and launch it as soon as its resources are granted? In other words, what happens when only part of the requested resources is available: does it wait for resources to be freed, or does it proceed with what is available?

  2. How does the MR application master decide the resource requirements for an MR job? Does the YARN MR client determine them and send them to the AM, or does the AM work them out itself? Either way, what are they based on? I believe this is configurable, but I am asking about the default case, when memory and CPU are not specified.


1 Answer

1
votes

No, the AM does not wait for all resources to be allocated. Instead, it launches containers as the resource manager grants them.
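To make that concrete, here is a toy simulation, not actual YARN code. `run_job` and `grants_per_heartbeat` are made-up names, and the model assumes the resource manager hands back some number of containers on each heartbeat, and the AM starts one pending task per granted container right away.

```python
from collections import deque


def run_job(num_tasks, grants_per_heartbeat):
    """Toy model of incremental container launch (hypothetical, not the YARN API).

    num_tasks: number of tasks the AM wants to run.
    grants_per_heartbeat: containers granted by the RM on each heartbeat.
    Returns, per heartbeat, the list of task ids launched in that heartbeat.
    """
    pending = deque(range(num_tasks))
    launched_per_heartbeat = []
    for granted in grants_per_heartbeat:
        launched = []
        # Launch one pending task per granted container; do not wait for the rest.
        for _ in range(granted):
            if not pending:
                break
            launched.append(pending.popleft())
        launched_per_heartbeat.append(launched)
        if not pending:
            break  # every task has a container, nothing left to request
    return launched_per_heartbeat


if __name__ == "__main__":
    # Only 2 containers are free at first, none on the next heartbeat, then 3 more.
    print(run_job(5, [2, 0, 3, 1]))
```

In this model the first two tasks start immediately even though the cluster cannot yet satisfy the whole job; the remaining requests simply stay pending until resources are freed.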

The size requested for each container is defined in the job configuration when the job is created by the driver. If the values were not set explicitly for the job, the values from mapred-site.xml and mapred-default.xml are used (see https://hadoop.apache.org/docs/r2.7.1/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml for the default values of mapreduce.map.memory.mb, mapreduce.reduce.memory.mb, mapreduce.map.cpu.vcores and mapreduce.reduce.cpu.vcores). How these values get translated into the resources actually granted is a bit complicated and depends on the scheduler being used, the minimum container allocation settings, etc.
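As a rough illustration of the "minimum container allocation" part, here is a sketch of how a memory request might be rounded. It assumes the scheduler rounds each request up to a multiple of yarn.scheduler.minimum-allocation-mb and rejects anything above yarn.scheduler.maximum-allocation-mb, which is my understanding of the Capacity Scheduler; the Fair Scheduler has its own increment setting, so treat this as an approximation. `normalize_request` is a made-up helper, and the defaults of 1024 and 8192 are the stock yarn-default.xml values as far as I know.

```python
import math


def normalize_request(requested_mb, min_alloc_mb=1024, max_alloc_mb=8192):
    """Hypothetical helper: approximate the container size granted for a request.

    Assumes round-up-to-a-multiple-of-minimum behaviour; this is a sketch,
    not the scheduler's actual code.
    """
    if requested_mb > max_alloc_mb:
        # Assumption: requests above the maximum are rejected, not capped.
        raise ValueError("request exceeds the maximum allocation")
    # Always grant at least one minimum-sized unit.
    units = max(1, math.ceil(requested_mb / min_alloc_mb))
    return units * min_alloc_mb


if __name__ == "__main__":
    # e.g. mapreduce.map.memory.mb = 1536 with a 1024 MB minimum allocation
    print(normalize_request(1536))
```

Under these assumptions, a 1536 MB map container request would occupy 2048 MB on the cluster, which is why the granted size can differ from what the job configuration asks for.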

I don't know for certain whether there is a maximum number of containers that the MR app master will request, other than (number of input splits for mappers) + (number of reducers). The MR app master releases containers when it is done with them (e.g., if you have 1,000 mapper containers but only 20 reducers, it will release the other 980 containers once they are no longer needed).