I had a couple of questions regarding job submission to HDFS and the YARN architecture in Hadoop:
So in the Hadoop ecosystem you have one NameNode for each cluster which can contain any number of data nodes that store your data. When you submit a job to Hadoop, the job tracker on the NameNode will pick each job and assign it to the task tracker on which the file is present on the data node.
So my question is how do the components of YARN work together in HDFS:?
So YARN consists of the NodeManager and the Resource Manager. Out of these two components: Is the NodeManager run on every DataNode and the ResourceManager runs on each NameNode for each cluster? So when the task tracker (in each DataNode) gets assigned a task from the job tracker (in the NameNode), the NodeManager in a specific data node will create an container which will request resources from the ResourceManager in the NameNode. So this resource manager and node manager only come into play when a task tracker in a data node gets a job from the job tracker in the NameNode, in which the NodeManager will ask the ResourceManager for resources for the job to be executed. Is this correct?