I am new to the CDH4 Oozie workflow editor. When I try to invoke a Pig script from the Oozie workflow editor, I get the following error:

HadoopAccessorException: E0900: Jobtracker [mymachine:8032] not allowed, not in Oozies whitelist

It looks like Oozie is submitting the job to the YARN port (8032). I want it to submit to port 8021 (the MR1 JobTracker). Can someone help me identify where to set the JobTracker URL or port (using Hue or Cloudera Manager) so that Oozie picks up the correct one?

So far I have tried the following, but none of it helped:

  1. Modified the workflow.xml file at /user/hue/oozie/workspaces/../workflow.xml. However, it gets overwritten when I submit the job from the workflow editor.

  2. In Cloudera Manager --> Oozie --> Configuration --> Oozie Server (advanced) --> Oozie Server Configuration Safety Valve for oozie-site.xml, I set the following

    <property>
        <name>oozie.service.HadoopAccessorService.nameNode.whitelist</name>
        <value>mymachine:8020</value>
    </property>
    <property>
        <name>oozie.service.HadoopAccessorService.jobTracker.whitelist</name>
        <value>mymachine:8021</value>
    </property>

and restarted the Oozie service.

  3. Tried to override the 'jobTracker' property while configuring the Pig task. It appears as follows in the workflow file, but it does not take effect and port 8032 is still used.

<global>
    <configuration>
        <property>
            <name>jobTracker</name>
            <value>mymachine:8021</value>
        </property>
    </configuration>
</global>

I am using CDH4.

Thanks for looking into my question.

2 Answers

1
votes

If it is using 8032, Hue is configured for YARN/MR2. Do you see any warnings on the /about page? Are you sure you are not using MR2 instead of MR1?

More info

On my setup I just leave it blank:

<property>
    <name>oozie.service.HadoopAccessorService.jobTracker.whitelist</name>
    <value> </value>
    <description>
        Whitelisted job tracker for Oozie service.
    </description>
</property>
0
votes

I had the same issue with our new Cloudera cluster and with my local Hadoop box running in VMware. I looked in Cloudera Manager for the host name of the mapreduce1 service and put it into the jobTracker property in my workflow.properties, which is used by:

<action name="report">
  <java>
    <job-tracker>${jobTracker}</job-tracker>
    ...
  </java>
</action>

For my local Hadoop box in VMware, the correct value was localhost.localdomain.
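
For a Pig action like the one in the question, the same idea might look roughly like the sketch below. It is only an illustration, not what Hue generates: it assumes the uri:oozie:workflow:0.4 schema, that the properties file defines jobTracker (for example mymachine:8021) and nameNode (for example hdfs://mymachine:8020), and the workflow name, node names and script.pig are made up.

```xml
<!-- Sketch only: names below are hypothetical, values come from the properties file -->
<workflow-app xmlns="uri:oozie:workflow:0.4" name="pig-example-wf">
    <start to="pig-node"/>
    <action name="pig-node">
        <pig>
            <!-- Resolved from the jobTracker property, e.g. the MR1 JobTracker on port 8021 -->
            <job-tracker>${jobTracker}</job-tracker>
            <!-- Resolved from the nameNode property -->
            <name-node>${nameNode}</name-node>
            <!-- Hypothetical script name, relative to the workflow application directory -->
            <script>script.pig</script>
        </pig>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Pig action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Whatever host and port ${jobTracker} resolves to still has to pass the jobTracker whitelist in oozie-site.xml if one is set, otherwise E0900 is raised again.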