
Can I write a sqoop import command in a script and execute it in Oozie as a coordinator workflow?

I have tried to do so and got an error saying sqoop command not found, even when I give the absolute path for sqoop to execute.

script.sh is as follows:

sqoop import --connect 'jdbc:sqlserver://xx.xx.xx.xx' --username sa --password --table materials --fields-terminated-by '^' --target-dir /user/hadoop/CFFC/oozie_materials -- --schema dbo

I have placed the file in HDFS and gave Oozie its path. The workflow is as follows:

<workflow-app xmlns='uri:oozie:workflow:0.3' name='shell-wf'>
<start to='shell1' />
<action name='shell1'>
    <shell xmlns="uri:oozie:shell-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
              <name>mapred.job.queue.name</name>
              <value>${queueName}</value>
            </property>
        </configuration>
        <exec>script.sh</exec>
        <file>script.sh#script.sh</file>
    </shell>
    <ok to="end" />
    <error to="fail" />
</action>
<kill name="fail">
    <message>Script failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name='end' />
</workflow-app>

Oozie returns a "sqoop: command not found" error in the MapReduce log.

So is that a good practice?

Thanks


2 Answers


The shell action runs as a mapper task, as you have observed. The sqoop command therefore needs to be present on every datanode where the mapper might run. If you make sure the sqoop command line is there and has the proper permissions for the user who submitted the job, it should work.

One way to verify:

  • ssh to the datanode as the specific user
  • run the sqoop command line to see if it works

Try adding the sqljdbc41.jar SQL Server driver to your HDFS and add an archive tag in your workflow.xml as below, then run the Oozie workflow command again:

<archive>${HDFSAPATH}/sqljdbc41.jar#sqljdbc41.jar</archive>

If the problem persists, add a hive-site.xml with the properties below:

javax.jdo.option.ConnectionURL
hive.metastore.uris

Keep hive-site.xml in HDFS, add a file tag in workflow.xml, and rerun the Oozie workflow.
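A minimal sketch of that file tag (this reuses the ${HDFSAPATH} variable from the archive example above; the actual HDFS path is an assumption and should match wherever you uploaded hive-site.xml):

```xml
<!-- Inside the shell action in workflow.xml, next to the <archive> tag.
     ${HDFSAPATH} is assumed to be the same HDFS directory that holds sqljdbc41.jar. -->
<file>${HDFSAPATH}/hive-site.xml#hive-site.xml</file>
```

The fragment after the # is the symlink name the file gets in the task's working directory, so the action can read it as a local hive-site.xml.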