2
votes

I'm using hadoop-2.7.2 and oozie-4.0.1. What should the jobTracker value be in the job.properties file of an Oozie workflow? I referred to this link:

http://hadooptutorial.info/apache-oozie-installation-on-ubuntu-14-04/

which states that, in the YARN architecture, the job tracker runs on port 8032, and I'm currently using this. But in Hadoop's mapred-site.xml I have the value hdfs://localhost:54311 for the job tracker property.

I'm confused. Can anyone explain this, or provide some useful links for installing Oozie and running jobs on it?

Currently, I'm not able to run workflow jobs on Oozie: a job stays in the Running state for a long time and then gets suspended with a connection error. The job DAG is also not generated; the UI throws an exception.

Can anyone please help me with this?

Check the value of yarn.resourcemanager.address in the YARN configuration and use that. It will be just host:port. - YoungHobbit
Hi, thanks, but where can we find this property (yarn.resourcemanager.address)? I don't have this property in yarn-site.xml. - karthi
Go to the conf directory and run a grep: grep -ri "yarn.resourcemanager.address" * - YoungHobbit
I am able to find this in the hadoop/conf/yarn-site.xml file. - YoungHobbit
I have not set this property in yarn-site.xml. Is the default port for yarn.resourcemanager.address 8032? - karthi

2 Answers

2
votes

In your properties file, just pass the ResourceManager address that you have configured in yarn-site.xml, or set the ResourceManager address directly in the workflow.xml file as

        <job-tracker>localhost:8032</job-tracker>
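
If yarn-site.xml has no explicit entry for the ResourceManager address, a minimal sketch of one could look like the following. It assumes a single-node setup on localhost; 8032 is the default port Hadoop uses when the property is not set, so adjust the host and port to your own cluster. With this in place, job.properties would carry jobTracker=localhost:8032.

    <!-- yarn-site.xml: assumed single-node values, not taken from the question -->
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>localhost:8032</value>
    </property>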

When submitting the job with the properties file, you also need to specify the host on which the Oozie server is running; I assume you had no issues with that part. If it still fails, paste the error message and update the question.

EDITED: Configuration needed in yarn-site.xml:

    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- The key must match the aux-service name above (mapreduce_shuffle) -->
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <description>NM Webapp address.</description>
        <name>yarn.nodemanager.webapp.address</name>
        <value>${yarn.nodemanager.hostname}:8042</value>
    </property>
    <property>
        <description>hostname</description>
        <name>yarn.nodemanager.hostname</name>
        <value>localhost</value>
    </property>

You can specify either the hostname or localhost for a pseudo-distributed (single-node) cluster. For an HA cluster, see the following:

https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html

1
votes

In a production environment, you have probably configured a high-availability YARN cluster. In this case, the Oozie jobTracker setting in job.properties should be the value of yarn.resourcemanager.cluster-id.

An excerpt from my YARN configuration:

    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>datayarn</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>resourcemanager1,resourcemanager2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.resourcemanager1</name>
        <value>11.11.11.11</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.resourcemanager2</name>
        <value>11.11.11.12</value>
    </property>

So, the jobTracker value should be: datayarn
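
For illustration, here is a rough sketch of how that value would reach an action in workflow.xml, assuming job.properties defines jobTracker=datayarn and a nameNode property. The action name and the "end"/"fail" transition targets are hypothetical placeholders, and the action body is trimmed to the two relevant elements.

    <!-- workflow.xml fragment; ${jobTracker} is resolved from job.properties -->
    <action name="example-action">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>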