2
votes

I am a bit confused about what the term "MapReduce" means with respect to Hadoop 1.x. In this context I have come across terms like JobTracker and TaskTracker (the daemons in MapReduce). When we say MapReduce, does it refer to these daemons or to the API that a developer uses to code MapReduce applications?

Does the user application execute on the TaskTracker or on the JobTracker? Is MapReduce itself a run-time environment?

Can anyone please help me understand this in simple words?


1 Answer

2
votes

MapReduce is the programming model for data processing (in Hadoop).
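
To make "programming model" concrete, here is a minimal in-memory sketch of the idea in plain Python. It is not Hadoop's API: `map_fn`, `reduce_fn` and `run_job` are hypothetical names chosen for illustration, and `run_job` only stands in for what a framework would do (call map, group values by key, call reduce).

```python
from collections import defaultdict


def map_fn(line):
    """User-written map: emit (word, 1) for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1


def reduce_fn(word, counts):
    """User-written reduce: combine all values collected for one key."""
    return word, sum(counts)


def run_job(lines):
    """Stand-in for the framework: map, group by key (shuffle), then reduce."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_fn(line):
            groups[key].append(value)
    return dict(reduce_fn(key, values) for key, values in sorted(groups.items()))


result = run_job(["the quick fox", "the lazy dog", "The end"])
print(result)
```

The developer only writes the two small functions; everything in `run_job` is the part a MapReduce implementation takes over and distributes across machines.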

Its implementation in Hadoop 1.x is often referred to as the classic MapReduce implementation (or MapReduce v1). It has two parts: the JobTracker and TaskTracker daemons, which execute jobs, and the corresponding APIs (the user-facing, client-side part), which you use to write them. The daemons divide the work as follows:

  • The JobTracker coordinates the job run.
  • The TaskTrackers run the tasks that the job has been split into, so this is where your map and reduce code actually executes.
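
The division of labour between the two daemons can be sketched with a toy simulation. This is only an illustration under simplifying assumptions: `ToyJobTracker` and `ToyTaskTracker` are hypothetical classes, not Hadoop's real ones, the job is map-only, and tasks are assigned round-robin, whereas a real scheduler also looks at things like data locality and free task slots.

```python
class ToyTaskTracker:
    """Hypothetical worker: runs the tasks it is handed."""

    def __init__(self, name):
        self.name = name
        self.completed = []  # ids of the tasks this worker has run

    def run_task(self, task_id, user_map, split):
        # The user's code executes here, on the worker, not on the coordinator.
        self.completed.append(task_id)
        return [user_map(record) for record in split]


class ToyJobTracker:
    """Hypothetical coordinator: splits a job into tasks and hands them out."""

    def __init__(self, trackers):
        self.trackers = trackers

    def submit(self, user_map, records, split_size):
        # One task per input split.
        splits = [records[i:i + split_size]
                  for i in range(0, len(records), split_size)]
        output = []
        for task_id, split in enumerate(splits):
            # Round-robin assignment (an assumption made for this sketch).
            tracker = self.trackers[task_id % len(self.trackers)]
            output.extend(tracker.run_task(task_id, user_map, split))
        return output


trackers = [ToyTaskTracker("tt-1"), ToyTaskTracker("tt-2")]
job_tracker = ToyJobTracker(trackers)
squares = job_tracker.submit(lambda x: x * x, [1, 2, 3, 4, 5], split_size=2)
print(squares, [t.completed for t in trackers])
```

Note that the coordinator never calls the user's function itself; it only decides which worker runs which task.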

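To sum up, the MapReduce APIs determine how a job is written against the MapReduce programming model, whereas the implementation determines how a job written that way is executed. So "MapReduce" on its own names the model; in Hadoop 1.x the run-time environment is provided by the JobTracker and TaskTracker daemons.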

The YARN implementation (MapReduce v2) keeps the same programming model and largely the same user-facing APIs. What changes is how jobs are executed: instead of a JobTracker and TaskTrackers, it uses a ResourceManager, NodeManagers and a per-job ApplicationMaster.