
(Solved) I want to connect to a Hadoop cluster and get some job/task information.

In Hadoop 1, I was able to use JobClient (local pseudo-distributed mode, using Eclipse):

JobClient jobClient = new JobClient(new InetSocketAddress("127.0.0.1",9001),new JobConf(config));
JobID job_id = JobID.forName("job_xxxxxx");
RunningJob job = jobClient.getJob(job_id);
.....

Today I set up a pseudo-distributed Hadoop 2 YARN cluster; however, the above code doesn't work. I use the ResourceManager port (8032).

JobClient jobClient = new JobClient(new InetSocketAddress("127.0.0.1",8032),new JobConf(config));

This line gives exception: Exception in thread "main" java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.

I searched for this exception, but none of the suggested solutions work. I use Eclipse, and I have added all the Hadoop jars, including hadoop-mapreduce-client-xxx. I can also run the example programs on my cluster successfully. Any suggestions on how to use JobClient on Hadoop 2 / YARN?

Update: I was able to solve this issue by compiling with the same Hadoop libraries as the ResourceManager server. In Eclipse it still throws this exception, but after I compiled and deployed my project it works fine (I'm not sure why, since in Hadoop 1 it worked in Eclipse). There is no need to change the API; JobClient still works in Hadoop 2.
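For anyone who hits the same issue: once the client is compiled against the same Hadoop 2 libraries as the cluster (and the client configuration can resolve the YARN addresses), the old JobClient calls for job and task information still work. Below is a minimal sketch of that usage; the class name and the job ID are placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.RunningJob;
import org.apache.hadoop.mapred.TaskReport;

public class JobInfo {
    public static void main(String[] args) throws Exception {
        // The cluster addresses are read from the configuration on the classpath
        // (mapred-site.xml / yarn-site.xml), so no socket address is needed here.
        JobClient jobClient = new JobClient(new JobConf(new Configuration()));

        JobID jobId = JobID.forName("job_xxxxxx"); // placeholder job ID
        RunningJob job = jobClient.getJob(jobId);
        if (job != null) {
            System.out.println("State: " + job.getJobState());
            System.out.println("Map progress: " + job.mapProgress());
            System.out.println("Reduce progress: " + job.reduceProgress());
        }

        // Per-task information comes from the task reports.
        for (TaskReport report : jobClient.getMapTaskReports(jobId)) {
            System.out.println(report.getTaskID() + " -> " + report.getProgress());
        }
    }
}

Note that for jobs that have already finished, the client may also need the job history server address (mapreduce.jobhistory.address) in its configuration before getJob returns anything.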


1 Answer


Have you configured the mapred-site.xml file as follows? It is located in $HADOOP_HOME/etc/hadoop/ in Hadoop 2.x.

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Edit: Also make sure that your yarn-site.xml (in the same location) contains the following property:

<property>
    <name>yarn.resourcemanager.address</name>
    <value>host:port</value>
</property>

One last thing: I strongly advise you to work with hostnames instead of IPs. There are known cases of failure with Hadoop when IPs are set in the configuration files.