We have two node hadoop yarn cluster, it is hadoop 2.2, on which We have scheduled two actions in a single workflow using oozie, the first action is a python map-reduce streaming action, and the second one is sqoop export job, which actually transfers the output of map-reduce streaming action to mysql database.
The streaming action successfully gets executed,which results in the starting of sqoop job, which remains running for ever.
stdout results in the following.
Sqoop command arguments :
export
--connect
jdbc:mysql://localhost/database
--username
root
--password
root
--table
tableName
--direct
--export-dir
/user/hduser/oozieProject/workflow/output
=================================================================
Invoking Sqoop command line now >>>
2137 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
2158 [main] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.4.2.0.6.1-102
2170 [main] WARN org.apache.sqoop.tool.BaseSqoopTool - Setting your password on the command-line is insecure. Consider using -P instead.
2178 [main] WARN org.apache.sqoop.ConnFactory - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
2197 [main] INFO org.apache.sqoop.manager.MySQLManager - Preparing to use a MySQL streaming resultset.
2197 [main] INFO org.apache.sqoop.tool.CodeGenTool - Beginning code generation
2464 [main] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `missedCalls` AS t LIMIT 1
2483 [main] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `missedCalls` AS t LIMIT 1
2485 [main] INFO org.apache.sqoop.orm.CompilationManager - HADOOP_MAPRED_HOME is /usr/local/hadoop
3838 [main] INFO org.apache.sqoop.orm.CompilationManager - Writing jar file: /tmp/sqoop-hduser/compile/21bd1d5fe13adeed4f46a09f8b3d38fe/missedCalls.jar
3847 [main] INFO org.apache.sqoop.mapreduce.ExportJobBase - Beginning export of missedCalls
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
the job properties is as follows
nameNode=hdfs://master:54310
jobTracker=master:8035
queueName=default
oozie.libpath=${nameNode}/user/hduser/share/lib
oozie.use.system.libpath=true
oozie.wf.rerun.failnodes=true
oozieProjectRoot=${nameNode}/user/hduser/oozieProject
appPath=${oozieProjectRoot}/workflow
oozie.wf.application.path=${appPath}
oozieLibPath=${oozie.libpath}
mapred.tasktracker.map.tasks.maximum=4
mapred.tasktracker.reduce.tasks.maximum=4
inputDir=${oozieProjectRoot}/data/*
outputDir=${appPath}/output
the workflow xml is as follows
<!--Oozie workflow file: workflow.xml -->
<workflow-app name="WorkflowStreamingMRAction-Python" xmlns="uri:oozie:workflow:0.1">
<start to="streamingaAction"/>
<action name="streamingaAction">
<map-reduce>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${outputDir}"/>
</prepare>
<streaming>
<mapper>python mapper.py</mapper>
<reducer>python reducer.py</reducer>
</streaming>
<configuration>
<property>
<name>oozie.libpath</name>
<value>${oozieLibPath}/mapreduce-streaming</value>
</property>
<property>
<name>mapred.input.dir</name>
<value>${inputDir}</value>
</property>
<property>
<name>mapred.output.dir</name>
<value>${outputDir}</value>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>4</value>
</property>
</configuration>
<file>${appPath}/mapper.py#mapper.py</file>
<file>${appPath}/reducer.py#reducer.py</file>
</map-reduce>
<ok to="sqoopAction"/>
<error to="killJobAction"/>
</action>
<action name="sqoopAction">
<sqoop xmlns="uri:oozie:sqoop-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<command>export --connect jdbc:mysql://localhost/database --username root --password myPwd --table tableName --direct --export-dir /user/hduser/oozieProject/workflow/output</command>
</sqoop>
<ok to="end"/>
<error to="killJobAction"/>
</action>
<kill name="killJobAction">
<message>"Killed job due to error: ${wf:errorMessage(wf:lastErrorNode())}"</message>
</kill>
<end name="end" />
Please advice what could have gone wrong?
Thank you