As far as I know, YARN was introduced to replace the JobTracker and TaskTracker.

I have seen some Hadoop 2.6.0/2.7.0 installation tutorials that set mapreduce.framework.name to yarn and the mapred.job.tracker property to local or host:port.

The description of the mapred.job.tracker property is:

"The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task."

My question is: why configure it if we are using YARN? The JobTracker shouldn't be running at all, right?

Forgive me if my question is dumb.

Edit: These are the tutorials I was talking about.

http://chaalpritam.blogspot.in/2015/01/hadoop-260-multi-node-cluster-setup-on.html

http://pingax.com/install-apache-hadoop-ubuntu-cluster-setup/

https://chawlasumit.wordpress.com/2015/03/09/install-a-multi-node-hadoop-cluster-on-ubuntu-14-04/

I don't think the first tutorial (chaalpritam.blogspot.in/2015/01/…) is using YARN at all. It explicitly removes the mapreduce.framework.name property, which then defaults to local rather than yarn. – RubenLaguna

1 Answer

This is just a guess, but either the authors of those tutorials do not fully understand YARN, or they set the property in case you decide to stop using YARN someday. You are right: the JobTracker and TaskTracker do not exist under YARN. You can add the properties if you want, but they will be ignored when mapreduce.framework.name is yarn. YARN introduced new properties for the components that replace the JobTracker and the TaskTracker, such as yarn.resourcemanager.address in place of mapred.job.tracker (whose newer name is mapreduce.jobtracker.address).
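To make the point concrete, here is a minimal sketch in Python that reads a mapred-site.xml and reports JobTracker keys that are set even though the framework is yarn. The sample file content, the master:54311 address, and the helper names read_properties and unused_jobtracker_keys are all made up for illustration; this is not a Hadoop tool, only a way to check a config by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapred-site.xml content, similar to what such tutorials set up.
# The host and port are placeholders, not values from any real cluster.
MAPRED_SITE = """
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:54311</value>
  </property>
</configuration>
"""

# Old and newer spellings of the JobTracker address key.
JOBTRACKER_KEYS = ("mapred.job.tracker", "mapreduce.jobtracker.address")


def read_properties(xml_text):
    """Return the <property> entries of a Hadoop *-site.xml as a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is None:
            continue  # skip malformed entries without a <name>
        props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def unused_jobtracker_keys(props):
    """List JobTracker keys that are set although the framework is yarn."""
    # Hadoop falls back to "local" when mapreduce.framework.name is unset.
    framework = props.get("mapreduce.framework.name", "local")
    if framework != "yarn":
        return []
    return [key for key in JOBTRACKER_KEYS if key in props]


props = read_properties(MAPRED_SITE)
print(unused_jobtracker_keys(props))  # prints ['mapred.job.tracker']
```

If the list is non-empty, those entries can be deleted from the file without changing how jobs are submitted under YARN.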

If you list your Java processes when running Hadoop under YARN, you will see no JobTracker or TaskTracker:

10561 Jps
20605 NameNode
17176 DataNode
18521 ResourceManager
19625 NodeManager
18424 JobHistoryServer

You can read more about how YARN works here.