I am trying to execute a MapReduce task in an Oozie workflow using a <java> action.
O'Reilly's Apache Oozie (Islam and Srinivasan 2015) notes that:
While it's not recommended, Java action can be used to run Hadoop MapReduce jobs because MapReduce jobs are nothing but Java programs after all. The main class invoked can be a Hadoop MapReduce driver and can call Hadoop APIs to run a MapReduce job. In that mode, Hadoop spawns more mappers and reducers as required and runs them on the cluster.
However, I'm not having success using this approach.
The action definition in the workflow looks like this:
    <java>
        <!-- Namenode etc. in global configuration -->
        <prepare>
            <delete path="${transformOut}" />
        </prepare>
        <configuration>
            <property>
                <name>mapreduce.job.queuename</name>
                <value>default</value>
            </property>
        </configuration>
        <main-class>package.containing.TransformTool</main-class>
        <arg>${transformIn}</arg>
        <arg>${transformOut}</arg>
        <file>${avroJar}</file>
        <file>${avroMapReduceJar}</file>
    </java>
The Tool implementation's main() method looks like this:
    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new TransformTool(), args);
        if (res != 0) {
            throw new Exception("Error running MapReduce.");
        }
    }
The workflow crashes with the "Error running MapReduce." exception above every time. How do I get the output of the MapReduce job to diagnose the problem? Is there a problem with using this Tool to run a MapReduce application? Am I using the wrong API calls?
I am extremely disinclined to use the Oozie <map-reduce> action, as each action in the workflow relies on several separately versioned Avro schemas.
What's the issue here? I am using the 'new' mapreduce API for the task.
Thanks for any help.
If you set mapreduce.job.queuename for a "launcher" action (i.e. Java, Shell, Sqoop... anything but MapReduce), it will be propagated to your child MapReduce job, if any, but not used for the "launcher" job itself; you should also set oozie.launcher.mapreduce.job.queuename for that one. And they can be different, e.g. a high-priority queue for launchers and the default queue for heavy-duty child MR. - Samson Scharfrichter
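
As a rough sketch of what the comment suggests, the <configuration> block of the <java> action could set both properties. The queue name `launchers` is a hypothetical placeholder and would have to match a queue that actually exists on the cluster; `default` is the value already used in the workflow above.

```xml
<configuration>
    <!-- Queue for the Oozie launcher job itself.
         "launchers" is a hypothetical queue name, used here only for illustration. -->
    <property>
        <name>oozie.launcher.mapreduce.job.queuename</name>
        <value>launchers</value>
    </property>
    <!-- Queue propagated to the child MapReduce job started by the main class. -->
    <property>
        <name>mapreduce.job.queuename</name>
        <value>default</value>
    </property>
</configuration>
```

Keeping the two queues separate is a design choice rather than a requirement; both properties can point at the same queue if the cluster has only one.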