2
votes

Can someone pls tell what is the differece between MR1 and yarn and MR2

My understanding is MR1 will be having below components Namenode, secondary name node, datanode, job tracker, task tracker

Yarn Node manager Resource Manager

Is Yarn consists of MR1 or MR2 ( or both MR2 and Yarn are same?)

sorry if i asked basic level question

2

2 Answers

4
votes

MRv1 uses the JobTracker to create and assign tasks to task trackers, which can become a resource bottleneck when the cluster scales out far enough (usually around 4,000 clusters).

MRv2 (aka YARN, "Yet Another Resource Negotiator") has a Resource Manager for each cluster, and each data node runs a Node Manager. In MapReduce MRv2, the functions of the JobTracker have been split between three services. The ResourceManager is a persistent YARN service that receives and runs applications (a MapReduce job is an application) on the cluster. It contains the scheduler, which, as previously, is pluggable. The MapReduce-specific capabilities of the JobTracker have been moved into the MapReduce Application Master, one of which is started to manage each MapReduce job and terminated when the job completes. The JobTracker function of serving information about completed jobs has been moved to the JobHistory Server. The TaskTracker has been replaced with the NodeManager, a YARN service that manages resources and deployment on a host. It is responsible for launching containers, each of which can house a map or reduce task.

1
votes

YARN is a generic platform for any form of distributed application to run on, while MR2 is one such distributed application that runs the MapReduce framework on top of YARN